Python
Stacked Bar Chart in Python - Advanced
In this advanced tutorial, we delve deeper into the art of creating stacked bar charts using Python. Building upon our previous basic tutorial, we explore more sophisticated techniques to handle complex data structures and add attributes to our visualizations. We utilize data from Our World in Data to craft a country and age demographic stacked bar chart, and then we take on a new challenge: visualizing sales data for car models by different manufacturers. We begin by installing necessary external libraries and importing data from goodcarbadcar.net into a Pandas dataframe. The tutorial guides you through the process of creating a ranking of year-to-date sales by brand and setting up the […]
Creating Stacked Bar Charts in Python: A Beginner’s Guide
Data visualization is an essential aspect of data science, allowing us to understand complex data sets at a glance. One of the most effective visual tools is the stacked bar chart, which can display multiple data series stacked on top of one another. In this article, we’ll explore how to create stacked bar charts in Python using a practical example. Preparing the Environment Before diving into the data, we need to set up our Python environment. This involves installing external libraries such as requests, pandas, matplotlib, and seaborn. These can be installed using either conda or pip, depending on your preference. Getting Started with the Data Our journey begins with the acquisition of data. For this tutorial, […]
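As a rough illustration of the stacking idea described above, here is a minimal sketch that uses only matplotlib and NumPy. The category names and values are made up for illustration and are not the tutorial's data; the tutorial itself also mentions requests, pandas, and seaborn, which this sketch does not use.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative, made-up values (not the tutorial's data).
categories = ["Group A", "Group B", "Group C"]
series = {
    "Series 1": np.array([3, 5, 2]),
    "Series 2": np.array([4, 1, 6]),
}

fig, ax = plt.subplots()
bottom = np.zeros(len(categories))
for label, values in series.items():
    # Each series is drawn starting where the previous one ended.
    ax.bar(categories, values, bottom=bottom, label=label)
    bottom = bottom + values

ax.set_ylabel("Value")
ax.legend()
```

Calling `fig.savefig("stacked_bar.png")` afterwards would write the chart to a file; the file name is only an example.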
How to Align Items on Two Lists Despite Spelling Variations Using Python
Sometimes you may encounter two backup folders where most files appear identical. To your annoyance, however, some files are missing from one folder or the other, and some are named slightly differently, so you cannot tell which ones to keep and which to discard when building a complete backup folder. The following Python code uses Levenshtein distance to align the items on two lists. First, you need to install a library for Levenshtein, as well as Pandas, by opening a terminal and entering "pip install python-Levenshtein" or "pip install levenshtein." Lists 1 and 2 show recent box-office movies, but they are not identical and have spelling variations. Let's align […]
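The general idea can be sketched without any external library. The version below is not the post's code: it swaps in a pure-Python Levenshtein function and a simple greedy matching, and the file names, the `align` helper, and the `max_distance` threshold are all hypothetical choices for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def align(list1, list2, max_distance=3):
    """Greedily pair each item of list1 with its closest unused item of list2.

    Items with no counterpart within max_distance are paired with None.
    """
    def distance(x, y):
        return levenshtein(x.lower(), y.lower())

    remaining = list(list2)
    pairs = []
    for item in list1:
        best = min(remaining, key=lambda other: distance(item, other), default=None)
        if best is not None and distance(item, best) <= max_distance:
            pairs.append((item, best))
            remaining.remove(best)
        else:
            pairs.append((item, None))
    # Whatever is left in list2 has no partner in list1.
    pairs.extend((None, leftover) for leftover in remaining)
    return pairs


# Hypothetical file names with spelling variations.
folder1 = ["report_2021.pdf", "holiday-photo.jpg", "notes.txt"]
folder2 = ["Report-2021.pdf", "notes.txt", "budget.xlsx"]

for left, right in align(folder1, folder2):
    print(left, "<->", right)
```

Greedy matching is easy to follow but can pick a poor pairing when several names are nearly identical, so the threshold probably needs tuning for real folders.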
Data-Driven Science and Engineering. Chapter 3 Exercises
I used Python and worked on exercises in Chapter 3 of Data-Driven Science and Engineering, 2nd Edition (2022).
Data-Driven Science and Engineering. Chapter 2 Exercises
I used Python and worked on exercises in Chapter 2 of Data-Driven Science and Engineering, 2nd Edition (2022). I started off with an easier equation, the heat equation, by modernizing the book authors' Python code. The older scipy.integrate.odeint function for ordinary differential equations is replaced with solve_ivp from the same module. For the KdV equation, the following part of the code is replaced. \(u u_x\) is transformed by using \(\widehat{u u_x} = \int_{-\infty}^{\infty}uu_x e^{-i\kappa x} dx = \int_{-\infty}^{\infty}\frac{1}{2}\frac{d(u^2)}{dx} e^{-i\kappa x} dx = \frac{1}{2}i\kappa\widehat{u^2}\).
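To give a feel for the switch to solve_ivp, here is a minimal sketch of a heat equation \(u_t = a^2 u_{xx}\) solved in the Fourier domain. It is not the book's code or the post's code; the diffusivity, domain length, grid size, and initial profile are assumed values chosen only for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

a = 1.0   # thermal diffusivity (assumed value)
L = 20.0  # domain length (assumed value)
N = 64    # number of grid points (assumed value)
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
kappa = 2 * np.pi * np.fft.fftfreq(N, d=L / N)  # angular wavenumbers

u0 = np.exp(-x**2)       # assumed initial temperature profile
u0_hat = np.fft.fft(u0)  # complex Fourier coefficients


def rhs(t, u_hat):
    # In the Fourier domain, u_t = a^2 u_xx becomes d(u_hat)/dt = -a^2 kappa^2 u_hat.
    return -(a**2) * kappa**2 * u_hat


t_eval = np.linspace(0.0, 1.0, 11)
# The complex initial state makes solve_ivp integrate in the complex domain.
sol = solve_ivp(rhs, (0.0, 1.0), u0_hat, t_eval=t_eval, rtol=1e-8, atol=1e-10)

# Back to physical space; sol.y has shape (N, len(t_eval)).
u = np.fft.ifft(sol.y, axis=0).real
```

One detail to watch when porting: odeint calls the right-hand side as `f(y, t)`, whereas solve_ivp expects `f(t, y)`.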
Data-Driven Science and Engineering. Chapter 1 Exercises
I used Python and worked on exercises in Chapter 1 of Data-Driven Science and Engineering (2022).
Solution to Garbled Double-Byte Fonts in Mac Matplotlib-Generated PDFs
Double-byte fonts such as Japanese and Chinese ones are garbled in Mac matplotlib-generated PDFs when viewed in Adobe Acrobat. This article provides a solution.
You Need to Replace Miniforge Version of Miniconda to Get Latest one
My Mac's Miniconda was old (version 4.11.0). There seemed to be a much newer version 22.11.1 available, but when I typed 'conda update -n base conda' in the terminal as instructed, the update didn't take effect (see the output below). It also looked like various issues had accumulated. Upon recollection, I installed Miniconda with Miniforge when the official Conda version for Apple Silicon was still unavailable. Conda is an open-source package management and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them. It is commonly used for data science, scientific computing, and machine learning. Miniconda is a minimal distribution of Conda. […]
The Top 10 Reasons Why Non-Programmers Should Learn Python
Python is a powerful, versatile programming language widely used in many fields, from web development to data analysis, machine learning, and scientific computing. It is known for its simple, easy-to-learn syntax, making it a great choice for beginners who want to learn to program. In recent years, Python has become one of the most popular programming languages in the world and is in high demand for various job markets and industries. One of the main benefits of learning Python for non-programmers is its ease of use. Its syntax is similar to natural language, making it easy for beginners to understand and learn programming basics. Additionally, Python has clear and readable […]
Collaborating on Jupyter Lab: How to Improve Your Data Science Workflow as a Team
Jupyter Lab is a powerful open-source web-based platform for interactive computing and data science. It allows users to create and share documents that contain live code, equations, visualizations, and narrative text. The platform has become increasingly popular among data scientists and researchers, and it's easy to see why. In this article, we will explore how Jupyter Lab can be used for collaboration and teamwork and how it can help organizations to improve their data science workflow. Collaboration features in Jupyter Lab Collaboration is a crucial aspect of data science and research. Jupyter Lab provides several features that make it easy for teams to work together. The platform allows multiple users […]
Base64, PyScript, and WordPress
In a recent post, I found that WordPress misinterpreted indents and returned errors when I tried to run PyScript code on it. This might be because WordPress sanitizes the code for security reasons, which renders it nonfunctional. So, I converted the code into base64 to evade WordPress' watch. Code with no indent First, code with no indent. I converted the above code into base64 and pasted it into the py-script tag as src. The first two lines tell the browser to load the PyScript program. Results ↑No problem with hello world. Code with indents Next, code with indents. Its result is what I am interested in. The above code was […]
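The encoding step itself can be sketched with Python's standard library. The snippet being encoded is a made-up example, and wrapping the result in a data URI is only an assumption about how the encoded text might be supplied as a src value; the post's exact markup is not reproduced here.

```python
import base64

# A made-up snippet whose indentation must survive the round trip.
source = 'for i in range(3):\n    print("indent level kept:", i)\n'

# Base64 output contains no spaces or newlines, so there is no indentation to mangle.
encoded = base64.b64encode(source.encode("utf-8")).decode("ascii")

# Assumption: a data URI is one possible way to pass the encoded code as src.
data_uri = "data:text/plain;base64," + encoded

# Decoding restores the original code, indentation included.
decoded = base64.b64decode(encoded).decode("utf-8")
print(data_uri)
```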
Brief Review of PyScript on WordPress
Anaconda introduced PyScript at PyCon US 2022. It enables client-side execution of Python code in web browsers, and they say it is already compatible with NumPy, Pandas, and other third-party modules. I gave it a quick try on WordPress. Environment WordPress 5.9.3 Lightning 14.20.3 Safari 15.4 or Google Chrome 101.0.4951.54 macOS 12.3.1 Apple M1 Pro Examples The custom HTML block seems to be the best place for PyScript code. The simplest code works there. Results print("Hello, world!") Here comes the greeting! However, PyScript appears to override the default CSS settings, and the bold headings on this page are no longer bold. Similarly, bullet points of a list are missing. Whether […]
Notes on the Book "Effective Pandas"
The book I introduced in the last post has turned out to be so fantastic that I have decided to take notes on it. This is a live document; I started from Chapter 21 and am going on from there. I will return to the earlier chapters if I still have time and energy. The author recently appeared on the Real Python Podcast: Becoming More Effective at Manipulating Data with Pandas and talked about the book. Introduction Installation Data Structures Series Introduction Series Deep Dive Operators (& Dunder Methods) Aggregate Methods Conversion Methods Manipulation Methods Indexing Operations String Manipulation Date and Time Manipulation Plotting with a Series Dates in the Index […]
Book: Effective Pandas
Harrison, M. (2021) Effective Pandas: Patterns for Data Manipulation. Independently published. Pandas is a Python library for data analysis and visualization, and I use it almost every day. Python Podcast's interview with the Pandas guru Matt Harrison led me to buy his book "Effective Pandas." It walks you through the library and demonstrates how to use it effectively in data analyses. I'm just halfway through but am very impressed by the tips and the underlying philosophy, and in particular, by the simple but revolutionary "chaining" syntax, in which you write a method per line, chained one after another (see the example code below.) It makes code development much easier […]
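For readers who have not seen the chaining style, here is a minimal sketch of what it can look like. It is not taken from the book; the DataFrame, its column names, and the values are made up for illustration.

```python
import pandas as pd

# Made-up data for illustration only.
sales = pd.DataFrame({
    "brand": ["A", "A", "B", "B", "C"],
    "units": [10, 5, 7, 0, 3],
})

# One method per line, each operating on the result of the previous one.
summary = (
    sales
    .query("units > 0")                           # drop rows with no sales
    .groupby("brand", as_index=False)             # one group per brand
    .agg(total_units=("units", "sum"))            # total units per brand
    .sort_values("total_units", ascending=False)  # largest total first
    .reset_index(drop=True)
)
print(summary)
```

Because every step sits on its own line, a step can be commented out or inspected without touching the rest of the chain.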