Python Pandas For Excel on vBrownBag - show notes
21 Aug 2019
Here are show notes from my Python Pandas for Excel webinar episode on Chris Williams’s marvelous Python for DevOps series on the vBrownBag channel:
From the comments section:
“Katie … shows that you don’t need to be a data scientist to be able to use Pandas effectively in a business environment.”
Links
Environment Setup
- Step-by-step instructions: Install Python on your Windows machine
- Not ready to install Python? Start coding immediately at Repl.it
- Need an idea? Play with tonight’s live-code
Sample Data
- A bunch of sales-ey-looking data as Excel files in various sizes from “E For Excel”
- My 3 go-to sample files I use in almost all tutorials:
Pandas Tutorials
- An anonymous StackAbuse contributor’s Beginner’s Tutorial on the Pandas Python Library
- Category: Explain Programming Like I’m Five (but only the parts I care about)
- This is my favorite link if you hate trying to program without truly understanding your “data types.”
- It does a nice “explain it like I’m five” about Pandas “DataFrames” and Pandas “Series.”
- Ready for more? Deepak “Daksh” K’s Pandas DataFrame: A lightweight Intro isn’t quite as “ELI5” and feels a bit more “math-class”-ey, but I still like it.
- Bhavani Ravi’s Python Pandas – Basics to Beyond
- Category: Quick-start, Excel-specific
- Warning: this awesome article’s CSS got messed up in the actual code samples sometime since it was published (probably during the Medium -> HackerNoon migration). If you’re on a desktop computer and comfortable using the developer tools built into your browser, right-click on the black-backgrounded code snippets, click “Inspect Element,” and hand-edit the CSS from line-height: 1.5; to line-height: 3; in the code[class*="language-"] section of the CSS controlling the text.
- Chris Moffitt’s Common Excel Tasks Demonstrated in Pandas, part 1 and Common Excel Tasks Demonstrated in Pandas, part 2
- Category: Quick-start, Excel-specific
- Chris’s Practical Business Python site is a blog, so if you check out the rest of his material, I recommend starting at the oldest posts in the archives and working your way up, since blogs tend to start basic and grow more advanced over time.
- Chris also did a great episode, Escaping Excel Hell with Python and Pandas, in February 2019 on the Talk Python To Me podcast. A full transcript is available if you prefer reading over audio!
- Chris himself has a listicle of books on his site as well.
- Ankit Gandhi’s Replacing Excel with Python
- Category: Quick-start, Excel-specific
- Harish Garg’s Tutorial Using Excel with Python and Pandas
- Category: Quick-start, Excel-specific
- Kevin Markham’s #PandasTricks “cool feature of the day” tweets. Ongoing on Twitter – just started summer 2019!
- Category: Short attention span
- Daniel Chen’s Pandas For Everyone (link 2)
- Category: Full-on book
- I haven’t read this, but Kelly and Sean of the awesome Teaching Python podcast recommended it. You can see a big sample here if you’d like to know what his writing style is like.
- Pete Houston’s Read CSV file using pandas
- Category: Feature tutorial
- Mostly overkill about the read_csv() function if you’ve already played around, but I like that it covers handling ultra-ultra-large CSV files.
- (Tip: When you’re ready to write out your transformed “chunks” to one ultra-ultra-large CSV file, use the to_csv() function in “append” mode.)
- Ultra-ultra-large is probably bigger than you think it is. You’re going to love how much faster things run just by using Python instead of Excel without worrying about this.
- Shane Lynn’s Using iloc, loc, & ix to select rows and columns in Pandas DataFrames
- Category: Feature tutorial
- If you hit a wall selecting cells of a Pandas DataFrame to edit, and someone suggests you use .loc[], and you wonder what on earth that is, this article should help.
- Also good for learning how to grab specific row numbers out of a spreadsheet.
- Kevin Markham’s Top 8 resources for learning data analysis with Pandas listicle
- Category: Link listicle
- My Pandas Quick Examples
- Category: Hands-On Exercises
- If you like to test the waters on a new skill with little “just jump in and try it” challenges, read this and try the handful of “test yourself!” exercises.
- My Python for Excel 101 hands-on training course video.
- Category: Explain Programming Like I’m Five (but only the parts I care about)
- A 2.5-hour video is probably not the fastest way to learn Pandas, but if you’re new to programming, this might be useful.
- We really dig into data types, DataFrames, Series, etc. – but only as they pertain to getting you up and running with Pandas for spreadsheet editing.
- It says it’s “for Salesforce administrators,” but that’s just who my audience was that particular day.
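Two of the feature-tutorial tips above (reading an ultra-ultra-large CSV in “chunks” and appending each one with to_csv(), plus editing cells with .loc[]) can be sketched together. This is only a minimal illustration under stated assumptions, not code from any of the linked articles: the file names, the Id/Amount/Tier columns, the transform_in_chunks helper, and the tiny chunk size are all made up for the demo.

```python
import os
import tempfile

import pandas as pd


def transform_in_chunks(src_path, dest_path, chunk_rows):
    """Hypothetical helper: read a CSV a few rows at a time, edit, and append."""
    first_chunk = True
    # chunksize makes read_csv() hand back one small DataFrame at a time
    # instead of loading the whole file into memory.
    for chunk in pd.read_csv(src_path, chunksize=chunk_rows):
        chunk["Tier"] = "small"
        # .loc[row_filter, column_name] picks the cells to edit.
        chunk.loc[chunk["Amount"] >= 100, "Tier"] = "big"
        chunk.to_csv(
            dest_path,
            mode="w" if first_chunk else "a",  # "a" is append mode
            header=first_chunk,                # write column names only once
            index=False,
        )
        first_chunk = False


# Demo with a made-up 5-row file; a real "ultra-ultra-large" file would use
# a far bigger chunk_rows value.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "input.csv")
dest = os.path.join(workdir, "output.csv")
pd.DataFrame(
    {"Id": [1, 2, 3, 4, 5], "Amount": [50, 150, 20, 300, 100]}
).to_csv(src, index=False)

transform_in_chunks(src, dest, chunk_rows=2)
result = pd.read_csv(dest)
print(result)
```

Writing the header only for the first chunk matters: in append mode, to_csv() would otherwise repeat the column names in the middle of the output file.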
Pandas References
- Official documentation of Pandas commands
- Official documentation of Pandas commands available for every “Series”-typed expression
- Official documentation of Pandas commands available for “Series”-typed expressions that contain plaintext cells
- Official documentation of Pandas commands available for every “DataFrame”-typed expression
(There are fewer useful ones than for Series, in my opinion … how often do you really do a mass edit to an entire spreadsheet at once?)
- Mark Graph’s Pandas DataFrame cheat sheet (PDF)
- A quick-reference guide to useful things you can do once you’ve loaded data into a Pandas “DataFrame” object
- Karlijn Willems’s Pandas Cheat Sheet for Data Science in Python
- From 2016, and Pandas is always introducing more concise commands for doing things, so if something looks annoying, see if it’s been improved since then.
- Irv Lustig’s Data Wrangling With Pandas cheat sheet (PDF)
- My Python and Pandas CSV Processing Operations cheat sheet
- Python & Pandas documentation can be cryptic for new programmers to read (e.g. those not familiar with reading “function signatures”).
- I had this crazy idea that I could somehow “improve” upon it for beginners by describing Python/Pandas inputs to functions & operations as “put something here” placeholders like °°° or °°°1°°° & °°°2°°°.
- I’m not sure it really came out like I meant it to, but if it helps you visualize the way to put “expressions” together as you get started … yay!
General-Purpose Python Tutorials & References
- Jessica Garson’s Resources for Learning Python link listicle
- Be sure to check out the comments for more great links
- Chris Albon’s “How to do X with Python” cheat sheets
Misc
- Julia Evans’s Asking Good Questions – a great way to get good help on StackOverflow!
- A Reddit argument about whether sysadmins should abandon command-line tools for Python