They were here for only one month, but what an impression our students left behind. We expected that we would have to cut the workload just to end up with some kind of product. But no: we got all our requirements, with only one compromise forced by technology. That's it, we got what we wanted, and other companies are already showing interest. As you can see, their progress was pretty constant, indicating no big issues in their performance.
But first, let's go back in time. They started at the beginning of August, and we gave them a two-day workshop on Spark and Scala, nothing more. They were ready to go. During those two days, we told them that we were going to develop in an agile way of working. Their agenda was completely filled with meetings: planning, refinement, retrospective and daily stand-ups, in one-week sprints. At the end of the second day we had our first refinement. Later in the project one of them said: "Gosh, where did I end up? Giving points to all tasks, such a strict alignment, ..." But as she said it, she admitted that it helped quite a lot!
"Off you go!" we said at the end of those two days. Everything was theirs to decide, right up to the architecture (they had to use Spark on AWS, but how they solved all the other problems was entirely up to them). Looking back on the project, this was the only thing they complained about: a bit more guidance would have helped them. Still, we believe they would never have learned this much if we had not given them this freedom.
After only the first week, they were able to show some things: a basic front-end, uploading of files, ... Then they started on automated type recognition, boxplot calculations, outlier detection, categorical or string comparisons and the hardest part of the static outlier detection: the LOF (local outlier factor) algorithm. The LOF algorithm is the reason why, in their third week, they were not able to perform better than the week before. For the rest, they acted as a normal but good team: a small overestimation in the first weeks, extra workload flagged during the sprints, a learning curve, converging to constant progress, ...
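For readers who have not met LOF before, here is a minimal sketch of the idea in plain Python. It is not the students' Spark implementation; the function name `lof_scores` and the parameter `k` are ours, chosen for illustration, and the sketch assumes there are no duplicate points and more than `k` points in total. It compares every pair of points, which also hints at why making LOF perform well on larger data sets was the hard part.

```python
import math


def lof_scores(points, k):
    """Return the local outlier factor of every point (illustrative sketch).

    Assumes len(points) > k and no duplicate points.
    """
    n = len(points)
    # Pairwise Euclidean distances: O(n^2), fine for a small example only.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors = []  # indices of the k nearest neighbours of each point
    k_dist = []     # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        knn = order[:k]
        neighbors.append(knn)
        k_dist.append(dist[i][knn[-1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


# Four points on a unit square plus one point far away.
scores = lof_scores([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)], k=2)
```

A score close to 1 means a point is about as densely surrounded as its neighbours; a score well above 1 marks it as a candidate outlier. In the example, the four corner points score 1 and the distant point scores far higher.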
It is impressive how fast they finished (apart from the performance enhancements of LOF). Halfway through the month, they could already start on the streaming algorithms. Only two parts were really extras: gap recognition, and a later re-check of whether rows that had been marked as outliers became non-outliers thanks to the extra information that had entered the system since. But there it was, the single compromise we had to make... Spark 2.0 did not allow structured streaming yet, so we are only streaming one column at a time, but all the algorithms are implemented. Well done!
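To make the re-check idea concrete, here is a small sketch in plain Python for a single column of numbers. It is only an illustration under our own assumptions, not the code the students wrote: the names `boxplot_outliers` and `StreamingColumn` are hypothetical, the outlier rule is the classic boxplot rule (values more than 1.5 times the interquartile range outside the quartiles), and everything is recomputed on each new value, which a real streaming job would avoid.

```python
from statistics import quantiles


def boxplot_outliers(values, whisker=1.5):
    """Return the positions of values outside the boxplot whiskers."""
    if len(values) < 4:
        return set()  # too little data to say anything sensible
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return {i for i, v in enumerate(values) if v < low or v > high}


class StreamingColumn:
    """One column of a stream; re-checks earlier outliers on every new value."""

    def __init__(self):
        self.values = []
        self.flagged = set()  # positions currently considered outliers

    def add(self, value):
        """Add a value and return the positions that are no longer outliers."""
        self.values.append(value)
        current = boxplot_outliers(self.values)
        cleared = self.flagged - current
        self.flagged = current
        return cleared


column = StreamingColumn()
for v in [10, 11, 10, 12, 11, 20]:
    column.add(v)
# At this point the value 20 (position 5) is flagged as an outlier.
cleared = column.add(19)
# A second high value widens the box, so position 5 is cleared again.
```

The point of the sketch is the return value of `add`: a row that looked like an outlier when it arrived can be cleared later, once more data shows that it was not so unusual after all.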
Even an administrator view was finished during the last sprint. At InfoFarm we had put this in the backlog thinking they would never be able to finish it, so that they would learn how to ask to skip some work. Unfortunately... no, luckily, they did not have to learn this!
In the retrospective of the sprint on the streaming algorithms, we asked them to make a retrospective drawing. Everything that came to mind about the past week's sprint had to go on the drawing. You can see the bad start of the week, when they noticed that structured streaming was not available. But after the decision to go for a one-column stream, the sun began to shine and they delivered. Meanwhile, one of our students was in her cave, developing the LOF algorithm further. On Thursday they got pizza, since all InfoFarmers were working from our office and we all like pizza.
And finally, not on the picture but worth mentioning again: we allowed them to work from home one day a week. And yes, this works for students too. They worked very seriously, not to mention the Slack messages on Friday night or during the weekend. This was a group to be proud of. So one last big thank you, and we hope to see you again!
And the application itself?
We're considering releasing a test version. If you're interested, do not hesitate to drop us an email at email@example.com