Python

Python
How to Align Items on Two Lists Despite Spelling Variations Using Python

Sometimes you end up with two backup folders in which most files appear identical. Annoyingly, some files are missing from one folder or the other, and some are named slightly differently, so when you try to build one complete backup folder, you don't know which files to keep and which to throw away. The following Python code uses the Levenshtein distance to align the items on two lists. First, install a Levenshtein library, as well as Pandas, by opening a terminal and entering "pip install python-Levenshtein" or "pip install levenshtein." Lists 1 and 2 show recent box-office movies, but they are not identical and have spelling variations. Let's align […]
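As a rough illustration of the idea (not the post's own code, which relies on the Levenshtein library and Pandas), here is a minimal sketch that computes the edit distance in pure Python and pairs items greedily. The file names, the `align` helper, and the `max_ratio` threshold are all hypothetical choices made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def align(list1, list2, max_ratio=0.3):
    """Greedily pair each item of list1 with its closest unused item of list2.

    max_ratio is a hypothetical cutoff: a pair is accepted only when
    distance / (length of the longer name) does not exceed it.
    Unmatched items are paired with None.
    """
    remaining = list(list2)
    pairs = []
    for item in list1:
        best = min(remaining,
                   key=lambda s: levenshtein(item.lower(), s.lower()),
                   default=None)
        if best is not None:
            dist = levenshtein(item.lower(), best.lower())
            if dist / max(len(item), len(best), 1) <= max_ratio:
                pairs.append((item, best))
                remaining.remove(best)
                continue
        pairs.append((item, None))
    pairs.extend((None, s) for s in remaining)
    return pairs


# Hypothetical folder listings with spelling variations
list1 = ["report_final.txt", "holiday photo.jpg", "notes.md"]
list2 = ["report-final.txt", "holiday_photo.jpeg", "budget.xlsx"]

pairs = align(list1, list2)
for left, right in pairs:
    print(f"{left!s:20} | {right!s}")
```

Greedy pairing is simple but order-dependent; an optimal assignment over the whole distance matrix would be a reasonable next step for longer lists.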

Read more
data science
Data-Driven Science and Engineering. Chapter 3 Exercises

I used Python and worked on exercises in Chapter 3 of Data-Driven Science and Engineering, 2nd Edition (2022).

Read more
data science
Data-Driven Science and Engineering. Chapter 2 Exercises

I used Python and worked on exercises in Chapter 2 of Data-Driven Science and Engineering, 2nd Edition (2022). I started off with an easier equation, the heat equation, by modernizing the book authors' Python code. The legacy scipy.integrate.odeint function for ordinary differential equations is replaced with solve_ivp from the same library. For the KdV equation, the following part of the code is replaced. The nonlinear term \(u u_x\) is transformed using \(\widehat{u u_x} = \int_{-\infty}^{\infty}uu_x e^{-i\kappa x} dx = \int_{-\infty}^{\infty}\frac{1}{2}\frac{d(u^2)}{dx} e^{-i\kappa x} dx = \frac{1}{2}i\kappa\widehat{u^2}\).
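The identity for the nonlinear term can be checked numerically in its discrete, periodic form. The sketch below is not the post's code: it uses a plain O(N^2) DFT from the standard library so it runs without extra packages (numpy.fft would be the usual choice), and the test function on [0, 2π) is an arbitrary band-limited example chosen for illustration.

```python
import cmath
import math

N = 32                                          # grid points on [0, 2*pi)
x = [2 * math.pi * n / N for n in range(N)]

# Hypothetical band-limited test function and its exact derivative
u = [math.cos(t) + 0.5 * math.sin(2 * t) for t in x]
u_x = [-math.sin(t) + math.cos(2 * t) for t in x]


def dft(f):
    """Plain O(N^2) discrete Fourier transform."""
    n_pts = len(f)
    return [
        sum(f[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


# Integer wavenumbers in FFT order: 0, 1, ..., N/2-1, -N/2, ..., -1
kappa = [k if k < N // 2 else k - N for k in range(N)]

# Left side: transform of the product u * u_x
lhs = dft([a * b for a, b in zip(u, u_x)])

# Right side: (1/2) * i * kappa * transform of u^2
u2_hat = dft([a * a for a in u])
rhs = [0.5j * k * c for k, c in zip(kappa, u2_hat)]

max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(f"max |lhs - rhs| = {max_err:.2e}")
```

The two sides agree to rounding error here because u^2 contains only wavenumbers up to 4, well below the Nyquist limit of the 32-point grid.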

Read more
data science
Data-Driven Science and Engineering. Chapter 1 Exercises

I used Python and worked on exercises in Chapter 1 of Data-Driven Science and Engineering (2022).

Read more
data science
Solution to Garbled Double-Byte Fonts in Mac Matplotlib-Generated PDFs

Double-byte fonts, such as Japanese and Chinese ones, are garbled in PDFs generated by matplotlib on a Mac when viewed in Adobe Acrobat. This article provides a solution.

Read more
data science
You Need to Replace the Miniforge Version of Miniconda to Get the Latest One

My Mac's Miniconda was old (version 4.11.0). A much newer version, 22.11.1, seemed to be available, but when I typed 'conda update -n base conda' in the terminal as instructed, the update didn't take effect (see the output below). It also looked like various issues had accumulated. I then recalled that I had installed Miniconda via Miniforge when the official Conda version for Apple Silicon was still unavailable. Conda is an open-source package management and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them. It is commonly used for data science, scientific computing, and machine learning. Miniconda is a minimal distribution of Conda. […]

Read more
Python
The Top 10 Reasons Why Non-Programmers Should Learn Python

Python is a powerful, versatile programming language widely used in many fields, from web development to data analysis, machine learning, and scientific computing. It is known for its simple, easy-to-learn syntax, making it a great choice for beginners who want to learn to program. In recent years, Python has become one of the most popular programming languages in the world and is in high demand across job markets and industries. One of the main benefits of learning Python for non-programmers is its ease of use. Its syntax is similar to natural language, making it easy for beginners to understand and learn programming basics. Additionally, Python has clear and readable […]

Read more
data science
Collaborating on Jupyter Lab: How to Improve Your Data Science Workflow as a Team

Jupyter Lab is a powerful open-source web-based platform for interactive computing and data science. It allows users to create and share documents that contain live code, equations, visualizations, and narrative text. The platform has become increasingly popular among data scientists and researchers, and it's easy to see why. In this article, we will explore how Jupyter Lab can be used for collaboration and teamwork and how it can help organizations to improve their data science workflow. Collaboration features in Jupyter Lab Collaboration is a crucial aspect of data science and research. Jupyter Lab provides several features that make it easy for teams to work together. The platform allows multiple users […]

Read more
Python
Base64, PyScript, and WordPress

In a recent post, I found that WordPress misinterpreted indents and returned errors when I tried to run PyScript code on it. This may be because WordPress sanitizes the code for security reasons, which renders it nonfunctional. So, I converted the code into base64 to slip past WordPress's checks. Code with no indent First, code with no indent. I converted the above code into base64 and pasted it into the py-script tag as src. The first two lines tell the browser to import the PyScript program. Results ↑No problem with hello world. Code with indents Next, code with indents. Its result is what I am interested in. The above code was […]
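The base64 conversion step can be sketched with Python's standard library. The snippet being encoded and the data-URI form of the result are assumptions made for illustration; the excerpt does not show exactly how the encoded string was passed to the src attribute.

```python
import base64

# Hypothetical snippet with indentation, standing in for the PyScript code
source = (
    "for i in range(3):\n"
    "    if i % 2 == 0:\n"
    "        print(i, 'is even')\n"
)

# Encode the source text; the result contains no spaces or line breaks
encoded = base64.b64encode(source.encode("utf-8")).decode("ascii")

# One possible way to hand the payload to a src attribute (an assumption)
data_uri = "data:text/plain;base64," + encoded

# Round trip: decoding restores the code with its indentation intact
decoded = base64.b64decode(encoded).decode("utf-8")

print(data_uri)
print(decoded == source)
```

Because the encoded string has no leading whitespace or newlines, there is nothing left for an editor to re-indent or strip.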

Read more
Python
Brief Review of PyScript on WordPress

Anaconda introduced PyScript at PyCon US 2022. It enables client-side execution of Python code in web browsers, and they say it is already compatible with NumPy, Pandas, and other third-party modules. I gave it a quick try on WordPress. Environment: WordPress 5.9.3, Lightning 14.20.3, Safari 15.4 or Google Chrome 101.0.4951.54, macOS 12.3.1, Apple M1 Pro. Examples: The custom HTML block seems to be the best place for PyScript code. The simplest code works there. Results: print("Hello, world!") Here comes the greeting! However, PyScript appears to override the default CSS settings, and the bold headings on this page are no longer bold. Similarly, bullet points of a list are missing. Whether […]

Read more
Python
Notes on the Book "Effective Pandas"

The book I introduced in the last post has turned out to be so fantastic that I have decided to take notes on it. This is a live document; I started from Chapter 21 and am going on from there. I will return to the earlier chapters if I still have the time and energy. The author recently appeared on the Real Python Podcast: Becoming More Effective at Manipulating Data with Pandas and talked about the book. Introduction, Installation, Data Structures, Series Introduction, Series Deep Dive, Operators (& Dunder Methods), Aggregate Methods, Conversion Methods, Manipulation Methods, Indexing Operations, String Manipulation, Date and Time Manipulation, Plotting with a Series, Dates in the Index […]

Read more
Python
Book: Effective Pandas

Harrison, M. (2021) Effective Pandas: Patterns for Data Manipulation. Independently published. Pandas is a Python library for data analysis and visualization, and I use it almost every day. Python Podcast's interview with the Pandas guru Matt Harrison led me to buy his book "Effective Pandas." It walks you through the library and demonstrates how to use it effectively in data analyses. I'm only halfway through but am very impressed by the tips and the philosophy behind them, and in particular by the simple but revolutionary "chaining" syntax, in which you write one method per line, chained one after another (see the example code below). It makes code development much easier […]

Read more