I just got back from visiting my graduate program at Texas McCombs and had the pleasure of running a workshop on how to craft a digital portfolio for analytics-based projects. I was inspired by the creatives in my life who showcase their work, and I wanted to create a space where other data scientists or potential employers could see mine.
In this post, I’ll talk about:
- What a portfolio is and why you need one
- How to set up a data project in your portfolio
- Best practices for portfolio creation
- Portfolio inspiration and favorite technology
A portfolio should…
If nothing else, these are the most important takeaways. Your portfolio tells a story about who you are and what you can do. Whether you are a data scientist or analyst, it’s important to show not only how technically skilled you are, but also how well you can communicate insights and recommendations.
Start by asking yourself:
- Who am I?
- What do I want in my next role?
- What work have I done and what can I do that shows those relevant skills?
Structure of a Data Project
Computer programmers and scientists will disagree with me, but I don’t think any business person or executive will. You have to assume that when someone is reading about your project, they can leave at any minute, so you have to grab their attention at the beginning and tell them the outcomes first. It’s the same theory as presenting your recommendation first to a CEO. They are busy and they might have to leave a presentation 30 minutes early. You want to make sure they got your most important points, so you should structure your project in this order:
Too long; didn’t read. This is your synopsis, your abstract, whatever you want to call it. You have to assume this might be the only part of your project someone actually reads. Keep it short and, although you will have an area to discuss findings, spoil the ending for them.
“During an audit of CVS’s social media presence, I found that 75% of their share of voice came from employees posting job descriptions for pharmacists, which does not align with their goal of attracting more analytical talent”
Any analysis without purpose is going to be weak. Always think about what your hypothesis is before you start an analysis and try to reject it. Stating it at the beginning of your project gives the viewer insights into how you were thinking about the analysis.
“My hypothesis is that CVS is losing talent to large tech firms like Amazon, Google, and Facebook instead of to a competitor like Walgreens.”
Who gives a shit? No seriously. Why does anyone care about the analysis you just did? What did you find that impacts the business? Dive into what you found, but keep it relevant to the hypothesis you set for yourself or the request of a client or assignment. It might be cool that you wrote an algorithm that can predict box office success using social media mentions, but what do we do with that? How does it help? Don’t be afraid to go deep here, but keep it relevant.
“I found that the click-through rate for the analytics job site is 25% lower than the industry average and that drop off occurs when they visit the student job posting page. Upon inspection, we found that the search function was not very user-friendly, and students were leaving before looking at any available jobs.”
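A finding like “25% lower than the industry average” is easy to get wrong if you mix up percentage points and relative differences, so it can help to show the arithmetic. Here is a minimal sketch in Python. The function names and every number in it are made up for illustration; they are not from a real audit.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Clicks divided by impressions, as a fraction between 0 and 1."""
    if impressions == 0:
        raise ValueError("impressions must be greater than zero")
    return clicks / impressions


def relative_gap(observed: float, benchmark: float) -> float:
    """How far `observed` sits from `benchmark`, as a fraction of the benchmark."""
    if benchmark == 0:
        raise ValueError("benchmark must be non-zero")
    return (observed - benchmark) / benchmark


# Hypothetical inputs, chosen only to illustrate the calculation.
site_ctr = click_through_rate(clicks=300, impressions=10_000)
industry_ctr = 0.04  # assumed benchmark, not a real industry figure

gap = relative_gap(site_ctr, industry_ctr)
direction = "lower" if gap < 0 else "higher"
print(f"Site CTR is {abs(gap):.0%} {direction} than the benchmark")
```

Stating the benchmark and the formula next to the headline number lets a reader check the claim without digging through the full analysis.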
At this point, the people who REALLY care are still around. You’ve probably caught their attention and they’re asking themselves “How did they find this out? What data did they use? What analysis did they perform?” Your analysis should answer some of the following questions:
- How did you decide what data to use?
- Where did you get the data?
- Did you have to clean or transform the data?
- What did you find when you explored the data?
- What models did you use to come to your final result?
- Were there any challenges and how did you overcome them?
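To make the cleaning and exploring questions concrete, here is a rough sketch of what that part of a write-up might show, using only the Python standard library. The inline dataset, column names, and helper functions are all hypothetical; a real project would more likely load a file and use whatever tools you prefer.

```python
import csv
import io
import statistics

# Hypothetical raw data: one row has a missing value and one is a duplicate.
RAW_CSV = """page,visits,applications
home,1200,30
student_jobs,800,
search,950,12
search,950,12
"""


def load_rows(text: str) -> list[dict]:
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows: list[dict]) -> list[dict]:
    """Drop exact duplicates and rows with missing values; cast counts to int."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(row.items())
        has_missing = any(v is None or v.strip() == "" for v in row.values())
        if key in seen or has_missing:
            continue
        seen.add(key)
        cleaned.append(
            {
                "page": row["page"],
                "visits": int(row["visits"]),
                "applications": int(row["applications"]),
            }
        )
    return cleaned


rows = load_rows(RAW_CSV)
cleaned = clean(rows)

# A first exploratory look: how many rows survived, and a simple summary.
print(f"Kept {len(cleaned)} of {len(rows)} rows")
print("Mean visits:", statistics.mean(r["visits"] for r in cleaned))
for r in cleaned:
    print(r["page"], "conversion:", r["applications"] / r["visits"])
```

Showing how many rows you dropped and why is a small detail, but it answers the cleaning question before anyone has to ask it.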
Visualizations and Data
Be sure to include visualizations across your project. They are a great way for more visual learners to understand the data. When possible, always include a link to your code and data. The project should be replicable.
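One way to make a project easier to replicate is to fix any random seeds and publish a fingerprint of the exact data you used. The sketch below is only an illustration: the helper names and the sample data are assumptions, and it uses just the Python standard library.

```python
import hashlib
import random


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest so readers can confirm they have the same data."""
    return hashlib.sha256(data).hexdigest()


def sample_rows(rows: list, k: int, seed: int = 42) -> list:
    """Draw k rows with a fixed seed so the same sample comes back on every run."""
    rng = random.Random(seed)
    return rng.sample(rows, k)


# Hypothetical data standing in for the contents of a real file.
raw_bytes = b"page,visits\nhome,1200\nsearch,950\n"
print("Data fingerprint:", fingerprint(raw_bytes))

first = sample_rows(list(range(100)), k=5)
second = sample_rows(list(range(100)), k=5)
print("Same sample both times:", first == second)
```

Listing the fingerprint next to the link to your data gives anyone rerunning the code a quick way to check they are starting from the same place.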
A digital portfolio should be an extension of your resume. It can be long form, but it should have a purpose. Some Do’s and Don’ts around creating a digital portfolio:
Do:
- Create a professional URL
- Link to your GitHub or LinkedIn
- Use visualizations
- Keep it concise
- List portfolio URL in your LinkedIn profile and resume
- Practice good SEO (especially as a marketer)
Don’t:
- Mix personal and professional content
- Drag on about your inspiration for this project. No one cares.
- Link to accounts with no information
- Go overboard with designs
Play to your strengths. If you have a web development background, use it! If you don’t know what colors match, use a free pre-defined template and a cheat sheet for color codes. Don’t feel the need to switch up designs all the time. Find what works for you and keep at it.
I like to steal ideas from UX designers and other data scientists. I’ve outlined a few of my favorites:
iamtrask.github.io – This is a beautiful and well designed GitHub portfolio that I really appreciate. If you feel like you are less on the creative side, this is a great example of how to make very technical projects digestible.
Rachel Downs Wood is a fantastically talented data scientist at Facebook, who used her creativity to showcase her work as a data scientist. I always look to her for best practices on how to present my work.
Patrick Chew aka Mr. Chandler Nunez – Patrick is a product designer at Xbox (and my boyfriend) and I look to him as inspiration for great design. He is a great example of how to show great design just by being simple and clean.
Sarah Sadie Gilbert – The Rizzoli to my Isles. The Jobs to my Woz. Sadie is a user researcher formerly of WP Engine and is another great example of a digital portfolio. Her strength is highlighting the importance and methodology of her work as an information professional.
These are some of my favorite platforms for sharing work. I think any combination works just fine and there are others that are great that I didn’t mention. Personally, I think a site like WordPress is great for giving the flavor of who you are and the impact of your work. GitHub, aside from being a great place to contribute to open source, is a great place to store data and code as a supplement to one’s website.
For those who are not as comfortable coding, Tableau and Power BI are great tools to visualize data. There’s also nothing wrong with Excel, as long as it looks good and makes sense.
For those that need design help, Canva is great for creating header images using whatever picture or background you want and the tools to overlay text. I personally use Color Calculator from Sessions College pretty frequently to know which colors complement each other when creating visualizations.
trinket.io is a great embedded IDE that you can put directly in your portfolio and anyone can run the code on your site.
If you’re not using Jupyter Notebooks or JupyterLab, get on it! It’s a great tool to create and document code. I like it a lot more than any other Python IDE and you can even run R in it. I also love that you can use Markdown to make your documentation a lot prettier than inline comments.
I hope this post helps and if anyone has other best practices or suggestions, I would love to hear them. Just contact me or leave a comment!