Here we are, in phase 3 of the team refactoring project. After part 2, it took quite some time for all the members of the team to build habits, adapt the automatic build on Jenkins and so on. In 6 months we changed so many things that we noticed some members couldn’t keep up with this rhythm of change… It’s a part rarely mentioned in all the Agile trend stuff, but no, your developers are not all at the same level! So we actually had to stop for a moment, take some people aside to go back over certain practices, justify the process to them, teach them and make them understand why it was better to work this way.
We had, for example, a whole debate on the pros and cons of Test Driven Development and deployment automation. We then ran exercises (games) with them on their own part of the project and held weekly reviews to follow the evolution of both their work and their mindset. I think it’s crucial to get the whole team on board with you, and for that you need to tailor your culture. You can’t move forward if only half of the team is paddling the right way… It took us almost a month to bring everybody back to the same level and convince them that these changes were important, but it was worth it all the way!
The second challenge we faced was technical, around Jenkins. Once all the projects were registered in Jenkins, it took us a LOT of time to configure it exactly how we wanted… The problem was not the job configuration itself but external problems such as network latency, connection timeouts, and wrong datasets in tests that referred to local data… We then had to modify our code to take into account all these issues that popped up on Jenkins but not on the local Vagrant machine. It was quite a setback in the process, since we had promoted the idea that tests would ensure the code could work no matter where 😛 In practice we had to add some timing procedures with wait() and sleep(), especially in Selenium scenarios, and also some complex scenarios to validate datasets before performing tests. Basically our fixtures were far from complete, and running them on a newly created database every time made us realize that we were lacking a lot of master data to perform all the scenarios we wanted.
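To give an idea of what these timing and dataset checks can look like, here is a minimal sketch in Python (not our actual test code). The helper names wait_until and missing_master_data are hypothetical, invented for this illustration. The first polls a condition instead of sleeping for a fixed time, the second reports which tables lack the rows a scenario needs.

```python
import time
from typing import Callable, Dict, List


def wait_until(condition: Callable[[], bool],
               timeout: float = 10.0,
               interval: float = 0.5) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper: returns True as soon as the condition holds,
    False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def missing_master_data(required: Dict[str, int],
                        row_counts: Dict[str, int]) -> List[str]:
    """Return the tables holding fewer rows than a scenario needs.

    `required` maps a table name to the minimum row count; `row_counts`
    is what the freshly created database actually contains.
    """
    return sorted(
        table for table, minimum in required.items()
        if row_counts.get(table, 0) < minimum
    )
```

Polling with a deadline usually makes a slow CI machine less painful than a fixed sleep(): the test continues as soon as the page or service is ready, and only fails after the full timeout.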
So yes, after some months Jenkins and tests are a win, BUT it is a non-negligible cost for the whole team, both to configure the whole pipeline to run smoothly and to teach people. It took more time than expected, but now that everybody agrees and that all Jenkins jobs are green (blue?) we’re ready to move to the next phase!
One click deployment
In the previous post (Part 2) we talked about an internal tool dedicated to developers called “Butler”. As the name suggests, it was created to answer any developer request regarding resources or project management. This tool has been deployed and is used internally to perform one-click deployment!
Here is how it looks:
This is the homepage of the tool.
Developers log in using their Gmail business account, so no new account is needed.
For those interested, the design was done using MaterializeCSS, a CSS framework based on Google’s Material Design.
This project was a nice occasion to play with some new technologies. Internal projects like that are a great sandbox for developers to test things and usually motivate them a lot! New technologies that make your life easier = dream project 🙂
We then have a page for a specific project which aggregates various information: document templates for automated creation, the Deploy Master in charge of this project, the git URL, etc.
From this page we can create a new release on a specific date: the tool will automatically create an issue on the tracking system, a release note document and a test checklist document to plan this release. We can also see the servers and the builds attached to this project.
If we select a server in the previous screen, we end up on a dedicated page about this specific server. As I said earlier, we only operate on AWS, so we basically get all the info needed through the SDK provided by AWS.
Using the same SDK we can perform basic operations such as Start/Stop/Reboot/Snapshot. Production servers have a security lock on these actions: only administrators can perform them.
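The rule itself is simple enough to sketch in a few lines of Python. This is an illustration only, not the code of the tool: the Server and User classes, the "production" label and the can_perform function are all hypothetical names.

```python
from dataclasses import dataclass

# The four operations exposed on the server page.
ACTIONS = {"start", "stop", "reboot", "snapshot"}


@dataclass(frozen=True)
class Server:
    name: str
    environment: str  # hypothetical labels, e.g. "production" or "staging"


@dataclass(frozen=True)
class User:
    email: str
    is_admin: bool = False


def can_perform(user: User, server: Server, action: str) -> bool:
    """Return True if `user` may run `action` on `server`.

    Production servers are locked: only administrators may act on them.
    Any other environment is open to every logged-in developer.
    """
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if server.environment == "production":
        return user.is_admin
    return True
```

Keeping the check in one function means every button on the page goes through the same gate, which is also the natural place to log who did what.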
Last but not least, for each project we can do a deployment. This button opens an SSH connection to launch the deploy recipe on the server. This part looks a lot like the cloud-init for the server and forces it to fetch the latest changes to the recipe and resources.
Once the process is done, the SSH output is displayed at the bottom of the page.
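As a rough idea of the mechanics behind the deployment button, here is a hedged Python sketch; the real tool is not written this way. The "deploy" user, the recipe path /opt/deploy/run.sh and the function names are assumptions made up for the example. The runner is injectable so the logic can be exercised without a real server.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def build_deploy_command(host: str,
                         user: str = "deploy",
                         recipe: str = "/opt/deploy/run.sh") -> List[str]:
    """Build the ssh command line that launches the deploy recipe.

    The user and recipe path are hypothetical placeholders.
    BatchMode=yes makes ssh fail instead of prompting for a password.
    """
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", recipe]


def _run(argv: Sequence[str]) -> subprocess.CompletedProcess:
    # Capture stdout and stderr as text so they can be shown on the page.
    return subprocess.run(list(argv), capture_output=True, text=True, check=False)


def deploy(host: str, runner: Runner = _run) -> Tuple[bool, str]:
    """Run the recipe on `host`; return (success, combined output)."""
    result = runner(build_deploy_command(host))
    output = (result.stdout or "") + (result.stderr or "")
    return result.returncode == 0, output
```

Returning the combined output together with a success flag is what lets the page display the SSH log whether the deployment worked or not.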
Of course this tool provides a lot of features, but this one-click deployment is the best by far. It aims at eradicating human error from the deployment process and continues the DevOps mind shift started in part 2.
Regarding the development of this project, it took 2 months for a single developer allocated at 30% to the project, so well under one man-month of effort. So nothing crazy, but the gains were tremendous! It also allowed us to remove the developers’ administrative access to the AWS console, which was a security risk and a human-factor problem. At that time we were not required to by any compliance policy, but it still feels good to follow good security practices! Everything goes through this tool; all access controls are checked and logged, which allows us to embed error checks and even fallback methods, as well as better authorization for developers.
It’s most likely that we will open-source this solution in the near future, but I am honestly not proud enough of the code yet to present it to the world 🙂 The installation and configuration process is rather manual and I’d like to make it less static before rolling it out. I’ll let you know when we publish it 🙂
A/B Testing conducive Architecture
The second major change is the integration of split testing into the codebase. Previously the company ran split tests on a dedicated platform, mainly for mock-ups and HTML design templates, pretty much independent from the production website. The designers produced 2 to 5 mock-ups of a page, sent them to the platform for review, and the winner was implemented in the production environment.
We decided to step up the game by including split test campaigns directly in the production site using a PHP library. We select the demographic group using the users’ profiles on our platform and, depending on their interests, we display different variations of the website to study the impact. For now the results are handled through Google Analytics campaign codes in order to see the effects of each variation.
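To make the idea concrete, here is a small Python sketch of the two decisions involved: is this user part of the campaign, and which variation do they see? It is not the PHP library we use. The hash-based assignment is an assumption of mine, chosen because it keeps a user on the same variation across visits, and the function names are hypothetical.

```python
import hashlib
from typing import List, Set


def is_targeted(user_interests: Set[str], campaign_interests: Set[str]) -> bool:
    """A user joins the campaign if they share at least one interest with it."""
    return bool(user_interests & campaign_interests)


def assign_variation(user_id: str, campaign: str, variations: List[str]) -> str:
    """Pick a variation deterministically from the user id and campaign name.

    Hashing (instead of drawing at random on each request) means the same
    user always lands on the same variation for a given campaign.
    """
    if not variations:
        raise ValueError("at least one variation is required")
    digest = hashlib.sha256(f"{campaign}:{user_id}".encode("utf-8")).hexdigest()
    return variations[int(digest, 16) % len(variations)]
```

The returned variation name is what would then be passed along as the campaign code for the analytics side.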
Honestly, so far the differences in behavior are barely noticeable and we’re far from mastering it. We end up with almost tied scores, 45% vs 55%. However, with the bigger updates we have in the pipe, I’m hoping to see clear results on what fits our users and how to give them the best experience possible, and the best gain for us.
Business Metrics Analysis
I often talk about the business actors as real stakeholders in the processes and product developed by our team. Once the dialog was open, the unanimous feedback we received was that… they were completely lost about what was happening in the product. When I asked them to clarify, they told us for example that there was no simple way to know how many users had registered in the past week. For that, they had to ask a developer directly to write a SQL statement to run… on the production database…
I know it looks like a joke but unfortunately it is not… And this is only a fragment of the kind of requests they were making or wanted to make. They were curious about a number of metrics, such as the profile completion rate and the recurrence of access. This is critical to business actors and we failed them by providing only a slow, on-demand, manual process.
Following this discussion, it was decided, as the highest priority, to deliver these metrics to them through a tool flexible enough to at least let them run their own queries, with an option to auto-update the results. We searched quite a bit among the Business Intelligence vendors but honestly most of their pricing was way out of our league…
And that’s when we discovered Klipfolio. Simple, beautiful, highly customizable. It could be linked to our CMS and database directly. In terms of process it’s far from perfect, because developers still have to write the SQL queries, but this time only once! And they can add parameters to allow people to query what they want.
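The "write the query once, add parameters" idea can be sketched like this in Python. This is not Klipfolio’s own syntax; sqlite3 stands in for our real database, and the users table, the registered_at column and the function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta


def count_registrations(conn: sqlite3.Connection, since: datetime) -> int:
    """Count the users registered on or after `since`.

    The date is passed as a bound parameter, so the SQL text never changes:
    a developer writes it once and the business user only picks the date.
    Assumes a hypothetical `users` table with an ISO 8601 `registered_at`.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM users WHERE registered_at >= ?",
        (since.isoformat(),),
    ).fetchone()
    return row[0]


def registrations_in_last_days(conn: sqlite3.Connection,
                               days: int,
                               now: datetime) -> int:
    """Answer 'how many users registered in the past N days?'."""
    return count_registrations(conn, now - timedelta(days=days))
```

With days=7 this is exactly the "registrations in the past week" question, with no hand-written SQL run on production.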
Once the project was announced and the training of the business stakeholders began, you could see on their faces the joy of finally being able to do their work and having solid datasets to prepare campaigns, follow the evolution and so on! I know I am a technical guy, but I have to say, solving other people’s problems is the best part of the job, especially when you make someone’s life easier!
For now we’ll use Klipfolio as a first step into Business Metrics Analytics. We finally have the possibility to access, query and get metrics, but the bigger question is which metrics we should pull out. Not everything makes sense, but that’s a discussion for another post 😉
Performance & Scaling
As I previously said, our infrastructure is on AWS; however, the current codebase, which is pretty legacy, prevents us from scaling the way we’d like. The code is entirely based on PHP 5.4, and images are stored locally on disk using some spaghetti code you’d wish you had never seen, which prevents us from scaling horizontally by splitting the load between several machines. Indeed, since the images are local, we would have to copy them in real time to all the instances every time a user uploads or interacts with an image… Therefore we have to either upload them to a third-party service or machine, or decouple the code from the local filesystem.
Furthermore, beyond this scaling problem, we also have pure performance problems related to the code. For instance, the connections to the database are not optimized at all: no lazy or eager loading and no cache either… The main legacy backend system still uses mysql_connect, which is deprecated as of PHP 5.5 and removed in PHP 7, so it also prevents a simple upgrade to PHP 5.5 (which bundles OPcache) and later versions. We have to simplify the code logic itself: useless loops and recursions, and a lot of surprises brought by the legacy code…
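To illustrate what "lazy" and "cache" mean here, a minimal Python sketch follows. It is not our PHP code: sqlite3 stands in for MySQL, and the LazyConnection and QueryCache classes are hypothetical names for this example.

```python
import sqlite3
from typing import Callable, Dict, Hashable, Optional


class LazyConnection:
    """Open the database connection on first use, then reuse it."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._conn: Optional[sqlite3.Connection] = None
        self.opened = 0  # number of real connections made, for illustration

    def get(self) -> sqlite3.Connection:
        if self._conn is None:
            self._conn = sqlite3.connect(self._path)
            self.opened += 1
        return self._conn


class QueryCache:
    """Remember the result of an expensive lookup, keyed by its arguments."""

    def __init__(self, loader: Callable[[Hashable], object]) -> None:
        self._loader = loader
        self._store: Dict[Hashable, object] = {}

    def get(self, key: Hashable) -> object:
        # Only call the loader (i.e. hit the database) on a cache miss.
        if key not in self._store:
            self._store[key] = self._loader(key)
        return self._store[key]

    def invalidate(self, key: Hashable) -> None:
        # Call this when the underlying data changes.
        self._store.pop(key, None)
```

A page that never touches the database pays no connection cost, and a page that asks the same question twice only queries once.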
The project is then clearly identified: in order to scale and perform better, we shall do the following:
- Get the images out of the instances, put them in S3 and use a Content Delivery Network such as CloudFront
- Rewrite all the legacy code that prevents us from upgrading to PHP 5.5 and above, so basically wrapping all the mysql connections
- Optimize as much as possible the most viewed pages (metrics obtained thanks to the previous section): homepage, search page and user space
UPDATE: Regarding point 3, we could already optimize the code, especially the connections to the database, and gained 20% in performance! Before betting on cache and horizontal scaling, don’t forget to optimize your code!
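For the first item of the plan, the key step is to stop the application code from knowing where images live. Here is a hedged Python sketch of that decoupling (not our PHP code). The ImageStore interface, LocalImageStore and upload_avatar are hypothetical names; an S3-backed class would implement the same two methods and is deliberately left out here.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class ImageStore(ABC):
    """What the application needs from image storage, and nothing more."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> str:
        """Store the image and return the URL it can be served from."""

    @abstractmethod
    def load(self, key: str) -> bytes:
        """Return the raw bytes stored under `key`."""


class LocalImageStore(ImageStore):
    """Current behaviour: files on the instance's own disk."""

    def __init__(self, root: Path, base_url: str) -> None:
        self._root = root
        self._base_url = base_url.rstrip("/")

    def save(self, key: str, data: bytes) -> str:
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return f"{self._base_url}/{key}"

    def load(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def upload_avatar(store: ImageStore, user_id: int, data: bytes) -> str:
    """Business code only talks to the interface, never to the filesystem."""
    return store.save(f"avatars/{user_id}.png", data)
```

Once every upload goes through such an interface, swapping the local disk for S3 plus a CDN becomes a configuration change instead of a hunt through the spaghetti.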
Despite this legacy code, it is becoming unforgivable not to take smartphone and tablet users into account. The mission of the developers is then to refactor the code in order to provide new experiences for the users of these devices. The solution chosen was to migrate the current codebase to Responsive Design. With a system based on the famous CSS framework Bootstrap, we developed a grid system to match the needs of the website. This choice was mainly made for its relatively small impact on the codebase, since we didn’t need to touch the underlying business logic.
What’s more, this choice will reduce maintenance costs since it automates certain effects and saves us from maintaining X versions of the same website. It was also a good occasion to refresh the User Experience through a new interface integrating the latest graphical theme and usability best practices.
Following the Code Retreat, acting upon it
Finally, for 6 months now I have been organizing, with the help of my manager, several internal code retreats in order to present new concepts and keep people informed through a technology watch that they don’t always have time to do themselves. This policy paid off, since members looked seriously into new technologies (Codeception, Phpspec, ReactJS, MaterializeCSS, VueJS) and also participate much more!
For instance, this week we changed the format of the code retreat: instead of one guy talking for one hour, each member of the team gave a lightning talk about a subject that interests them. It’s both a nice team building activity and a technology watch that broadens your mind and presents things you may not have looked for by yourself.
Each member had to do a presentation around a common theme: testing. This first time the theme was chosen by the manager to guide the exercise, but the objective is clearly to leave it free and give the members the power to run this kind of event.
I was really blown away by the maturity of the team in this exercise. Despite the various characters we have internally, everybody took part in it. Those not comfortable talking simply showed the tests they had put in place and demonstrated their code. Others, handling their first project, talked about the importance of TDD and the struggle to estimate the time developers spend on it. Others, such as designers, openly admitted that they didn’t get it in the current framework, and their presentation was actually the result of their research on the topic.
Beyond the performance of talking in front of everybody, the time spent on research empowered some of them to discover new things and shifted their point of view. It was a really interesting event that I’d love to repeat in the future. What’s more, we spotted several showmen that we’ll coach to turn them into conference speakers who preach good practices. We just have to decide how often to hold this kind of event, but the concept is already validated by the team 🙂
After months of reflection and pros and cons debates, it was decided to get rid of the troublesome legacy backend application and rewrite it from scratch. This is a fairly big project which will take us until the end of the year, but it is finally launched! It was the main showstopper for the evolution of the product and its scalability, both vertical and horizontal. A new world is opening up, with us moving to PHP 5.6 and soon PHP 7.1 to finally catch up on our technical debt. I can’t wait to be there!
We will also conduct a serious study of the application performance by integrating Application Performance Monitoring tools such as New Relic and Blackfire into the new codebase.
The concept of split testing will be implemented further and the scope enlarged to other projects.
Next, and this goes beyond the technical scope: after some analysis, the results of this year were below our hopes. The number of sessions is skyrocketing, but user actions unfortunately do not follow the same pattern. After several debates on the matter, we found one of the root causes: we lack a leader for the main website on the business side. Until now the developers decided by themselves which priorities to implement. Since we had only just rolled out the business metrics, it was basically luck… We assigned a Product Owner to all the new projects, but the oldest and main project didn’t have a clear leader or feature roadmap… such a pity.
I do believe this is the biggest challenge of our department currently, because there are no business actors clearly identified and involved in the product development. The vast majority of the features, improvements or evolutions are created and implemented by us alone. The business stakeholders only report the results, and we’re missing the feedback needed to close the virtuous loop of product development. Until now we mainly followed the conversion rate and the number of sessions. We have no marketing direction, feature ideas or improvements proposed. The challenge for the company is then to find this business stakeholder to drive this evolution through the project, and for us to be there to help her, give her the tools and give her the means to implement the vision.
I like to think that developers cannot be reduced to that, but we should be a business tool and help the business implement their ideas and vision on the project. Of course, if they act as real peers it is even better, but a business stakeholder is a MUST in any product team.
As always do not hesitate to ask me questions or comment with your own experience/vision on team refactoring !!