Beyond toy data sets

June 24, 2019

D2D CRC’s most recent interns Kenley Walter and Tobin South learnt a lot about foosball this summer. But they assure us the value doesn’t stop there.

We’re talking large-scale enterprise applications, multilayered convolutional and recurrent deep neural networks, Agile methodologies and Natural Language Processing. Buzzwords aside, both agree that the greatest value was the chance to move beyond toy data sets and scrutinise real-world problems alongside D2D CRC’s data science masterminds.

What are you currently studying at university?

Kenley: I’m currently studying a Bachelor of Software Engineering at the University of South Australia.

Tobin: I just finished my Bachelor’s Degree in Mathematical and Computer Science majoring in Statistics and Applied Maths. I am about to start my Master of Philosophy in Applied Maths, researching the flow of disinformation and fake news online and the dynamics of echo chambers in social media.

Why did you decide to do an internship and what drew you to apply for D2D CRC?

Kenley: I was looking for ways to grow my skills to better prepare for the real world and found that internships were a great way to reinforce my skills and gain practical experience by working as a team on real projects that have a real impact. I have a passion for writing great software that analyses data and D2D CRC presented a unique opportunity that had a strong focus on finding solutions to problems using data, which had the added benefit of being in Adelaide.

Tobin: I knew a few people who had worked at, or with, D2D CRC. From every one of them, I had heard about the fun and inclusive workplace culture as well as the exciting and interesting work being done here. In particular, the work done by the Beat the News team was really interesting, and the opportunity to work on a project in that space aligned perfectly with my interests.

Tell us about your first day at D2D CRC. What did it involve? What were your first impressions?

Kenley: I was introduced to the Apostle team on my first day and began my work on the project after I had my first stand-up meeting. Everyone was incredibly welcoming. I met lots of amazing people and got to know more about the really exciting projects being worked on at D2D CRC. The relaxed atmosphere is one of the things you first notice, along with the competitive foosball games, which almost everyone at D2D CRC seems to be interested in.

Tobin: The first thing that strikes you about the D2D CRC offices at Base64 is the atmosphere. You don't walk through big offices full of people in suits; you walk into rooms filled with focused programmers staring at code while they mull over a coffee.

As it turned out, these programmers and the rest of the staff at D2D CRC are a welcoming team full of interesting people.

These welcoming people do have their competitive side, and when the lunch break came, I discovered the competitive joy of a game of foosball.

The first day wasn't all fun and foosball, though. I got cracking on my problem as soon as I could and, with the help of those around me, I was on my way to making something impressive.

How did your experience evolve over your time at D2D CRC? What kinds of projects were you working on and what did you learn?

Kenley: Throughout the internship, I worked on the Apostle project, adding features to the existing application. This was the first time I had worked on a large-scale enterprise application that had integration testing and a thorough development workflow, so I learnt a lot about the Spring Boot framework and how all the components from the front and back end work together to form a cohesive application.

Tobin: D2D CRC works on a variety of projects, but mine focused on developing a model to predict the emotions contained within a text conversation. This work falls within Natural Language Processing, a particularly interesting and challenging field, and I was very excited to be working in it.

The internship took me through the whole pipeline of developing a model, from feature exploration and engineering through to a finalised model capable of deployment.

As part of this, I got to explore a variety of modelling techniques, from simple linear models all the way through to multilayered convolutional and recurrent deep neural networks. It's one thing to learn about these techniques in a class, but using them on a real problem lets you find the flaws and quirks of models in a way that reading never will.

If you could think of one thing that defined your experience at D2D CRC, what would it be?

Kenley: Working with an amazing team would be the one thing that defined my experience at D2D CRC. The Apostle project is a large-scale application comprising multiple components, and it was the first time I had worked on a large-scale application. Whenever I had a technical question or needed another set of eyes for debugging, there were always people who were happy to provide a guiding hand and help me learn.

Tobin: Without a doubt, the thing that has defined my experience is mentorship. As an intern, you're not just here to do; you're here to learn and grow. The technical problems you face are usually hard, which is what makes them worth pursuing, but overcoming them on your own is a challenge. Without my mentors on this project, I could not have done anywhere near as well or learnt as much as I have.

What skills will you take from your time at D2D CRC?

Kenley: This internship has taught me a lot about working in a team on a real, large-scale project. It’s definitely the first time I’ve done so and I’ve learnt about working with enterprise tools, as well as the importance of having a comprehensive project workflow. I also learnt about Agile methodologies by attending regular stand-ups, sprint sessions and demo sessions.

Tobin: The challenge of learning data science is that no matter how much you learn by playing with toy data sets, in the real world, the stakes are different. This internship has given me the chance to question real data and explore it in all of its flaws and nuances. Part of that comes from knowing when to push through problems and when to move past them.

What advice would you give others looking to do an internship or work experience during their studies?

Kenley: An internship or work experience is perhaps the best way to learn relevant new skills, as it gives you the opportunity to work on real projects. It’s an excellent way to gain industry experience by learning important skills used in the real world, such as Agile methodologies and development workflows.

Tobin: Find an organisation that does work you are passionate about. The most important part of an internship or work experience is that you learn new skills and enjoy your time doing so. Find a place solving interesting problems using skillsets you want to develop.

The new generation of data scientists

Across its five-year lifetime, D2D CRC has hosted and mentored 29 interns, and has supported 20 Honours students. We’ve also created a community of 71 PhD students across seven universities.

D2D CRC’s focus has always been increasing the sustainability of the data science workforce. A large component of this involves engaging with and developing the future workforce through PhD scholarships, Honours scholarships and internships.

We like to think we’ve offered an environment that’s rich in growth by creating opportunity and fostering mentorship. What we can say for sure is that the contribution these early career researchers have made to our work at D2D CRC is immense.

Kenley and Tobin are the last of D2D CRC’s interns, with the CRC set to wind up on 30 June 2019.
