{"id":8012,"date":"2021-10-12T12:46:00","date_gmt":"2021-10-12T10:46:00","guid":{"rendered":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/?p=8012"},"modified":"2023-08-11T11:05:56","modified_gmt":"2023-08-11T09:05:56","slug":"does-pair-programming-work-in-data-science","status":"publish","type":"post","link":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/blog\/bi-data-analytics\/does-pair-programming-work-in-data-science\/","title":{"rendered":"Does pair programming work in data science?"},"content":{"rendered":"\n<p class=\"article-lead\">Pair programming is common in coding. But does it work in data science? And with remote work? Our data scientists gave it a whirl.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"459\" data-src=\"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-content\/uploads\/2022\/01\/BlogPicture_20220128-1024x459.png\" alt=\"\" class=\"wp-image-8351 lazyload\" data-srcset=\"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-content\/uploads\/2022\/01\/BlogPicture_20220128-1024x459.png 1024w, https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-content\/uploads\/2022\/01\/BlogPicture_20220128-300x134.png 300w, https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-content\/uploads\/2022\/01\/BlogPicture_20220128-768x344.png 768w, https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-content\/uploads\/2022\/01\/BlogPicture_20220128-1536x688.png 1536w, https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-content\/uploads\/2022\/01\/BlogPicture_20220128.png 1600w\" data-sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1024px; --smush-placeholder-aspect-ratio: 1024\/459;\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"pair-programming-defined\">Pair programming defined<\/h2>\n\n\n\n<p>In pair 
programming, two programmers work side by side at a single workstation. Like a rally team, they have separate but key tasks. One serves as the driver, writing code, while the other, the navigator, checks the driver\u2019s work. The roles should be swapped frequently. This is a key agile technique used at <a href=\"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/about-proekspert\/\" data-type=\"URL\" data-id=\"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/about-proekspert\/\">Proekspert<\/a> to speed up the onboarding of new employees \u2013 and it\u2019s especially efficient in an era of remote work.\u00a0<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"two-heads-better-than-one\">Two heads better than one<\/h2>\n\n\n\n<p>At the beginning of October, <strong>Kaspar Hollo<\/strong> joined the data science team as our newest member and I took on the role of tech buddy. And since <a href=\"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/about-proekspert\/\" data-type=\"URL\" data-id=\"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/about-proekspert\/\">Proekspert<\/a> encourages pair programming as an onboarding technique for new engineers, doubly so in a time of remote work, it was the right time to try it out.\u00a0<\/p>\n\n\n\n<p>I have done a lot of pair data science in my scientific career, but mostly using \u201cpoint and click\u201d data analysis software, rather than developing the software myself.&nbsp; I searched for articles on how to go about it and found a few, such as <a href=\"https:\/\/seanamcclure.medium.com\/does-pair-programming-work-in-data-science-5a7b277b1485\">this one<\/a> and <a href=\"https:\/\/medium.com\/ww-tech-blog\/the-case-for-pair-programming-for-data-science-for-2021-and-beyond-95402f71ba43\">another one<\/a>. 
These offered useful tips and warnings about benefits and dangers, but they read more like cookbook instructions than an evidence-based guide.<\/p>\n\n\n\n<p>In my experience, data science tasks can be so diverse and intermixed that general rules are likely to be oversimplified. So, I created my own guidelines for a working session:&nbsp;&nbsp;<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Define a goal you want to achieve.<\/li>\n\n\n\n<li>The \u201cdriver\u201d always writes code and constantly explains what they are doing. They alert their partner (who \u201crides shotgun\u201d) when they get stuck.&nbsp;<\/li>\n\n\n\n<li>The person riding shotgun keeps an eye on the code, checking for mistakes, bad coding style, etc. The person riding shotgun stands ready to brainstorm if the driver gets stuck, and can split off to perform quick searches for algorithms, or write key pieces of code.&nbsp;<\/li>\n<\/ol>\n\n\n\n<p>In most pair programming sessions, I teamed up with Kaspar. We worked on a traffic sign detection and localization project together. We used a single computer (at the office), worked in JupyterLab, and used Python.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"session-1-coding-with-kaspar\">Session 1: Coding with Kaspar&nbsp;<\/h3>\n\n\n\n<p>During a four-hour session I took the driver\u2019s role. I felt that I got stuck less often because I could brainstorm immediately. The many relevant questions from Kaspar riding shotgun helped me examine the problem from different angles (quite literally, as the task was about triangulation in 3D space). 
Since Kaspar has taught Python to thousands of Estonians through MOOCs, I also learned a few coding tips, tricks, and shortcuts that I didn&#8217;t know before.&nbsp;<\/p>\n\n\n\n<p>There were multiple positives to the whole session:&nbsp;<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>As the driver, it was easier to ignore the blinking Teams icon and focus on the technical task. This saved a lot of time that would otherwise be lost switching between tasks.&nbsp;&nbsp;<\/li>\n\n\n\n<li>If running a piece of code took longer than a few seconds but less than a few minutes, I could fill that time brainstorming the next mini-task, explaining some tech-buddy things, or just cracking jokes while we waited.&nbsp;<\/li>\n\n\n\n<li>Since two people need breaks at different times, we took shorter breaks more often than I usually do. This is good for the eyes, posture, etc.&nbsp;<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"subsequent-sessions-reversing-roles\">Subsequent sessions: reversing roles<\/h3>\n\n\n\n<p>In subsequent sessions, we tried reversing the driver and shotgun roles and even changed roles multiple times during a single session. The sessions continued to be useful and we were able to discuss and immediately test a variety of hypotheses. In one four-hour session we managed to complete most of the task list for a two-week sprint. This was only possible because we could continuously discuss the topic and next steps.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"a-data-science-pair-session-with-tanel\">A data science pair session with Tanel&nbsp;<\/h3>\n\n\n\n<p>After working with Kaspar, veteran data scientist Tanel Peet and I paired up for a data science session.&nbsp;<\/p>\n\n\n\n<p>Although pair work is mainly recommended for newcomers, our task was tricky and pairing up seemed useful. We\u2019d found that classical image-based tracking algorithms were not working well, so we decided to move to 3D ray tracking. 
Since Tanel had been working on tracking and I had worked on localization, and we now needed to combine both algorithms, the knowledge from both sides proved invaluable.&nbsp;<\/p>\n\n\n\n<p>We started with a two-hour brainstorming session where we discussed most aspects of the problem and thought deeply about the underlying mathematics. We discovered that the natural data structure for the task was a set of undirected graphs with the corresponding adjacency matrix (each isolated graph corresponds to a single track, and tracking comes down to finding rules that decide whether or not there is an edge between two nodes). It would have been tricky to come to that conclusion alone, since the data structure of the original tracking algorithm was a bounding box (rectangle) on an image, and the data structure of localization was a list of 3D vectors. It requires quite a shakeup to think outside the (bounding) box, and the pair session did it for us. After the brainstorming session, we discovered that the actual implementation of the idea did not benefit much from both of us working on the same code, so instead I worked on the implementation, and Tanel worked on visualization to check whether the set of rules and thresholds we produced was working.&nbsp;<\/p>\n\n\n\n<p>This kind of separation is not how pair programming is supposed to work, but flexible rules served us well in this case. The visualizations were extremely useful for developing the tracking rules, and writing the rules alone was not difficult once the natural data structure had been found during our brainstorm. By the evening, the algorithm was good enough that it made mistakes in only about 10% of cases (the previous tracking had around 20-30% mistakes), and around half of the apparent mistakes in the new algorithm\u2019s predictions turned out to be mistakes in the ground truth instead. 
For me, this session proved that pair data science can be a useful concept not only for newcomers but for data scientists of all levels.&nbsp;<\/p>\n\n\n\n<p>Given what we learned, I\u2019d suggest another rule for the next sessions: <em>Take time to consider whether the session is more productive than two people working solo. If it is not, then consider stopping the session.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"an-interview-with-newcomer-kaspar-hollo\">An interview with newcomer Kaspar Hollo&nbsp;<\/h2>\n\n\n\n<p><em>Proekspert&#8217;s agile and people coach <strong>Kadri Daljajev<\/strong> interviews onboardee Kaspar Hollo about his experience.<\/em><\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"kadri-did-you-have-any-previous-experience-with-pair-programming\"><strong>Kadri: Did you have any previous experience with pair programming?&nbsp;<\/strong><\/h4>\n\n\n\n<p>Kaspar: This was not the first time that I had used pair programming, since most of the projects during my master\u2019s studies were done in pairs, and professors encouraged us to try pair programming. I didn\u2019t see any advantages at first and had my doubts \u2013 I thought that it would only overcomplicate things and slow the process down. It turned out to be quite the opposite, especially when a deadline was approaching and we had to push through. Pair programming forces people to talk through the problems, allows them to brainstorm solutions at a moment\u2019s notice, and helps keep the pair from getting distracted. Frequent switching of the driver also allows for mini-breaks while maintaining a high tempo.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"how-did-you-find-the-experience\"><strong>How did you find the experience?&nbsp;<\/strong><\/h4>\n\n\n\n<p>I\u2019ve had a couple of interesting pair programming sessions at Proekspert with my tech buddy T\u00f5nis Laasfeld, some sessions live and some via Teams. 
When working over Teams, the driver simply shared his screen, and that was it. The driver needs nothing more than advice and ideas from the person riding shotgun.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"what-was-most-beneficial\"><strong>What was most beneficial?&nbsp;<\/strong><\/h4>\n\n\n\n<p>Since I joined the team and company in the middle of an ongoing project, I would say that the pair programming sessions served first and foremost as a good introduction to the project code. I was able to ask questions without feeling like I was interrupting my tech buddy. Afterward, it was much easier for me to solve problems in that particular part of the codebase by myself.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"what-else-did-you-like-about-it\"><strong>What else did you like about it?&nbsp;<\/strong><\/h4>\n\n\n\n<p>I enjoyed the social aspect of the pair programming sessions. Even though I already knew my tech buddy T\u00f5nis from university, I would still say that I got to know him a little better \u2013 how he approaches problems, or how many good math- and programming-related jokes one person can possibly make in a short period of time (he really pushed the boundaries on that one). Learning a couple of new tricks here and there was a real positive.&nbsp;&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"how-did-you-prepare-for-the-sessions\"><strong>How did you prepare for the sessions?&nbsp;<\/strong><\/h4>\n\n\n\n<p>We defined very clear goals about what we wanted to achieve and what had to be done to achieve them. This helped us keep track of progress and decide when to end each session. 
We also tried to keep the schedule relatively empty on these days so that nothing would disturb the pair programming sessions.&nbsp;&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"what-advice-would-you-give-to-the-next-newcomer\"><strong>What advice would you give to the next newcomer?&nbsp;<\/strong><\/h4>\n\n\n\n<p>Three things. One: Keep an open mind about pair programming. I also had my doubts in the beginning, but it positively surprised me.&nbsp; Two: Ask a lot of questions. They might lead to better solutions or eliminate misconceptions. And three: Try to get to know the person you are programming with. The sessions offer a good opportunity for that.&nbsp;&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"would-you-recommend-it\"><strong>Would you recommend it?&nbsp;<\/strong><\/h4>\n\n\n\n<p>I would say that pair programming is a useful method that helps kickstart onboarding for new employees.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Interested in joining Proekspert?&nbsp;<a href=\"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/join-us\/\">Check out our current vacancies here.<\/a><\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pair programming is common in coding. But does it work in data science? And with remote work? Our data scientists gave it a whirl.  
<\/p>\n","protected":false},"author":10,"featured_media":8342,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[3],"tags":[],"class_list":["post-8012","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bi-data-analytics"],"acf":[],"_links":{"self":[{"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/posts\/8012","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/comments?post=8012"}],"version-history":[{"count":12,"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/posts\/8012\/revisions"}],"predecessor-version":[{"id":12069,"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/posts\/8012\/revisions\/12069"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/media\/8342"}],"wp:attachment":[{"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/media?parent=8012"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/categories?post=8012"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/clients.triloogia.ee\/proekspert\/wp-new\/wp-json\/wp\/v2\/tags?post=8012"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}