Sai Ram's Blog

thoughts, ramblings and ideas of a geek

Identifying Bad Tests

Look at this Rails FactoryGirl definition for a regular user class:

FactoryGirl.define do
  factory :user do
    first_name "John"
    last_name  "Doe"
    subscription_end_date Date.parse("2019-01-01")
  end
end

It does not look harmful. This is a base model. When your other spec files start using it, everything works fine as long as subscription_end_date is only evaluated against other fields. But once 2019-01-01 has passed, from 2nd Jan 2019 onward, the tests that depend on this value will start failing.

Any hard-coded value that is tested against relative time, such as a before/after check based on when the test is running, will eventually start failing. Either use all relative dates or all absolute dates. Time.zone.now is relative unless you call Timecop.freeze beforehand.
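To make the failure mode concrete, here is a minimal plain-Ruby sketch with no Rails or FactoryGirl involved. The method names subscription_active?, absolute_end and relative_end are made up for illustration, and "today" is passed in explicitly so the effect of the calendar is visible.

```ruby
require "date"

# Hypothetical check a spec might exercise: is the subscription still active?
def subscription_active?(end_date, today)
  end_date >= today
end

# Absolute end date, the same value as in the factory above.
def absolute_end
  Date.parse("2019-01-01")
end

# Relative end date: always 30 days after the day the check runs.
def relative_end(today)
  today + 30
end

before_cutoff = Date.parse("2018-12-31")
after_cutoff  = Date.parse("2019-01-02")

# The absolute date passes before the cutoff and fails after it.
puts subscription_active?(absolute_end, before_cutoff) # true
puts subscription_active?(absolute_end, after_cutoff)  # false

# The relative date gives the same answer no matter when the check runs.
puts subscription_active?(relative_end(before_cutoff), before_cutoff) # true
puts subscription_active?(relative_end(after_cutoff), after_cutoff)   # true
```

In a factory, the equivalent change would be a dynamic attribute such as subscription_end_date { 30.days.from_now }, so that the value is computed when the record is built rather than fixed in the file.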

Happy hacking!


How to Track User Agreement Acceptance in Database

Problem Statement

How would you model whether a user has accepted your “Terms and Conditions” or “COPPA” when building your website? Let's talk about how you would track this in your data store (MySQL, Postgres or MongoDB).

On the front end this is a checkbox or an ‘I Accept’ button. The click translates to either true or false and gets stored in a boolean column in the users table in your database.

The Implicit Requirement

Sounds fair. One thing you may have missed is that websites like Microsoft, GitHub, Google and Dropbox tend to update their policies. They usually send out an email and give users time to review and accept the new T&C, which usually applies from a particular date. This requirement is generally lost in the fine print, since the stories that get created miss it for various reasons.

Respecting the Schema

Suppose your Product Owner says this has to be done. You have updated your T&C and, wanting to be a good person, you want to notify your users on their next sign-in, and then either stop them from using the service until they accept or keep them on the older code base or mechanisms so that there is no service interruption. If you had a boolean field, this is what one may tend to do:

  1. Run a cron job to make everyone ‘UnAccept’ the T&C, update the new ‘T&C’ data and try to log out users
  2. If you have a notifications module, you send a notification to the customers
  3. If you have a banner module, you may add it to the banner where you usually mention announcements or outage information

Banners and notifications should be presented only to those who did not accept, but it's hard to track this information: once you have overwritten the column, you have lost precious information about the user. Also,

  1. We don’t want to show the banners for those who have accepted.
  2. Some people may have accepted from their Profile page
  3. You need to group users by the T&C versions they have not accepted (when you tend to modify them often and may want different settings for those users instead of cutting off their service)

Elegant Schema Solution

The solution for this set of requirements would be to change the boolean column in the database to a CHAR or a positive integer column, either NULLable or with a default value like 19700101 as the original date, that holds either a date or a generated sequence of version numbers.

Every time the T&C is updated, your code can compare the current version of the T&C with the one the user last read and accepted.
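As a rough sketch of that comparison, assuming date-style integer versions like the 19700101 example: the field name accepted_tnc_version, the helper names and the sample users below are all invented for illustration, and plain hashes stand in for database rows.

```ruby
# True when the user has not yet accepted the current T&C version.
# A nil version means the user never accepted anything.
def needs_to_accept?(accepted_version, current_version)
  accepted_version.nil? || accepted_version < current_version
end

# Group the users who still have to accept by the last version they accepted,
# so banners or notifications can be tailored per group.
def pending_by_version(users, current_version)
  users
    .select { |u| needs_to_accept?(u[:accepted_tnc_version], current_version) }
    .group_by { |u| u[:accepted_tnc_version] }
end

users = [
  { name: "asha",   accepted_tnc_version: 20190101 },
  { name: "ben",    accepted_tnc_version: 20180601 },
  { name: "chitra", accepted_tnc_version: nil }
]
current_version = 20190101

pending_by_version(users, current_version).each do |version, group|
  puts "#{version.inspect}: #{group.map { |u| u[:name] }.join(', ')}"
end
```

With this sample data, asha is left alone, while ben (on 20180601) and chitra (never accepted) end up in separate groups. None of that can be recovered from a boolean column once it has been reset.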

Other Advantages

This is good for analytical purposes as well, and it allows users to accept the T&C beforehand instead of waiting until the switch is flipped after a deployment.

From the User Experience point of view, having to navigate to the T&C, read, understand or discuss, and then accept is a blocking experience for the user. Notifying users early and letting them accept early means they don't have to worry on the day of your deployment.

Also, this approach helps your content/legal team deal with new T&C without developer intervention and independently of deployments.

One thing to note is that the T&C versions must be generated as an incrementing sequence so that you can compare them.

This is general advice for any type of data that has to be accepted while the master reference can change over time.

Let me know what you think.


What Do SDE Levels mean?

When developers join a Product Technology company, they get confused by all the different types of roles in the company. I will start with the SDE job family. A job family has multiple levels. These levels are awarded to employees based on their current skill, and the level usually corresponds to the salary as well.

Being familiar with the levels at Amazon and Flipkart, I will use Flipkart's job levels, since the company has been a gold standard in the Indian tech industry.

Software Development Engineers are usually at 5 levels: SDE I, SDE II, SDE III, Architect and Principal Architect. It's usually the company's policy to keep the salary range, as well as the bar, separate for each level. The bar indicates the level of the candidate according to the various evaluations the company wants and how well the candidate can think.

It's a general tendency for everyone to look at the money at the next level and try to jump to it fast, along with the knowledge they can acquire by working with “Senior” folks in the team. This is the only way people feel like they learn. Teams or companies need to maintain a pyramid structure, with more people at lower levels being mentored by those at higher levels. You see a range of people with different interests.

SDE One (SDE I)

When you learn the basics of Computer Science and graduate with some interest in programming, you usually get into an SDE-I role. Even if you have 2-3 years of experience in another company, you would mostly be considered for SDE I only.

The candidate is assumed to have basic knowledge of computers, with the intent to learn anything s/he is told, and to follow orders given with reason and carry them out whether s/he likes them or not. Following SOPs (Standard Operating Procedures) is the bare minimum expectation. Finding solutions on StackOverflow is perfectly tolerable as long as they understand what they actually copied and the code goes through a review and matches existing code styles.

During this phase, engineers usually start with bug fixes and writing test cases (yes, this is usually missed in most universities). You are looking at classes or functions and are probably confused about why the classes are organised in what looks like a highly meaningless way.

One of your strengths should be Problem Solving. When a bug is reported, you should be able to reproduce it, identify it with the help of log messages or tools, understand what caused it and work with a team member to fix it.

Growing to the next level requires a lot of hard work, along with understanding domain knowledge, identifying problems, writing solutions and understanding design to an extent.

SDE Two (SDE II)

When you are within a company, you get opportunities to solve problems and contribute, which lets you showcase yourself and grow your knowledge. It may take 1-3 years to move up from SDE I to SDE II.

Getting interviewed from outside is different from growing within the company. You get one day with 3-5 rounds (apart from 2-3 telephonic rounds). The reason I feel Algorithms, Data Structures, Problem Solving and Coding are asked is that they cannot ask you to show your old code, nor can they have you spend a week with the team and share their proprietary codebases with you. (There are a few companies which actually do these too.)

Coding

They see how you write code given a problem. They would slightly tweak the requirements to see how you'd end up modifying the existing code. Here is where your knowledge of Design Patterns comes in, along with all the rest: clean code, test cases, separation of concerns, DRY (don't repeat yourself), abstractions and assumptions. What people usually do wrong in an interview is start coding immediately; you need to first clarify anything you feel is amiss in the requirements provided.

Problem Solving

Given a problem, you need to understand what may fail and why. This is part of the code you write above, but when you are at SDE-II, you need to think about Non-Functional Requirements (NFRs) as well. You need to think about identifying bottlenecks, work out what caused an outage in your codebase and pinpoint the code region (at the least).

Breadth of Knowledge

Usually a breadth of knowledge is required, not just in your own code but in the other libraries you use, and maybe in the infrastructure components your team uses.

What is expected out of you?

At this level, you may be in a limbo state, weighing multiple solutions with pros and cons for each. I tell you, that is a normal state to be in.

You are also expected to think about Application Security, SSL, Ownership, Auth and so on.

Roles and responsibilities include mentoring SDE-Is or people at a similar level in other job families. This is the toughest position to be in, since you are neither inexperienced nor highly experienced: it's tough to decide to give you a new project and too easy to decide to give you a bug fix. You get to become an SDE-III by learning patience, understanding existing problems and pitching to solve them. With all the problems you face, this may be the right time to showcase leadership strengths too.

SDE Three (SDE III)

This is a very tricky level to be interviewed for. This is where engineers are supposed to be mature in taking decisions, and it usually takes 5-10 years to get to that stage of maturity, depth and breadth in applying solutions, dealing with NFRs, problem solving and dealing with components other than your own code.

This is the level where Databases are not a black box. I have had many phone interviews to screen candidates at this level. The standard problems they face are

  1. Databases are black boxes
  2. Networking is a black box
  3. Data Replication across different clusters is taken care of
  4. Horizontal scaling works, as long as there are other DevOps managing it
  5. NoSQL is better than RDBMS
  6. Ability to keep an open mind

I will go over the reasons behind each of these, and why they don't work, in a separate blog post.

SDE III is, apart from other things, about understanding Distributed Systems. It's important since you live in a world of SOA. This is badly needed when you think about building scalable architectures: one wrong step and the team goes ‘Boom!’.

Showcasing patience at this level is important since you are the decision maker and have to deal with other SDE-Is and SDE-IIs in your team when they come in for advice.

There is system design, low-level design, gathering requirements, understanding things you did not know existed or care about the previous week (like saving costs by changing hardware), identifying resource wastage, building systems which all teams in your company may use, being able to present your opinion in the right way to showcase The Good, The Bad and The Evil of approaches, respecting lines, understanding the positives and negatives of a framework, being able to differentiate between various programming languages, anticipating problems you may hit and differentiating between short-term and long-term.

That's it for becoming an SDE-III. Get here and ask me what it takes to be an Architect, or by then you may have figured it out from the scale and dimensions people work on.


I was once given a piece of advice. With influence or privilege you can be put at a higher level than the one you are at, but you may not survive at that level because you may not understand what is happening. I failed an internal exam to be put into the IIT section (in 2003); they separated IIT classes from regular classes. I was devastated. An administrative guide who worked at the college gave me this advice: he had seen people put into the IIT section who had to be downgraded after trying for 3 months, having wasted that time and ended up stuck in a limbo state.

This is the same with levels at companies as well. If you think you are at a higher level, your boss or a (mature) colleague can tell you why you are not. There may be political or financial reasons for not promoting you even if you deserve it. Startup-type companies will give you the role along with recognition even if they can't afford the money. We talk about political youth party leaders who do not deserve to be where they are right now. Try not to be that person in your team.


Generating Header Images for Blog

Inspired by Sense-Si, where the process fills text into an HTML page and uses PhantomJS (in a Docker container) to take a screenshot, I wanted to see if I could make these in a simpler way with other tools I already use.

Trying out with ImageMagick

First, let's download the image

wget -O header-plain.jpg "https://source.unsplash.com/featured/1200x630/?mountains"

This can be done in a single command, but the text overflows:

echo -n 'How to generate caption images with ImageMagick?' | composite -background rgba\(255,255,255,0.5\) -font BrushScriptI -pointsize 50 label:"@-" -channel rgba -alpha set  -gravity south -transparent-color white header-plain.jpg header-title1.jpg

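The text overflows when the image is of limited width. Rendering the text first as a caption of fixed width (which wraps it) and then compositing that onto the image avoids this: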

echo -n 'How to generate caption images with ImageMagick?'| convert -size 1200 -font BrushScriptI -pointsize 50 -gravity center -background rgba\(0,0,0,0.5\) -fill white caption:'@-' text.png
composite -channel rgba -alpha set -gravity south text.png header-plain.jpg header-title.jpg

This is how it looks with ImageMagick

Trying with Cloudinary

Well, it's fairly straightforward from the documentation to add text to an image.

Note: this is a dynamic image rendered by Cloudinary's servers, and the text can be passed in the URL. The text just needs to be URL-encoded.
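As a small sketch of the encoding step, here is one way to build such a URL in Ruby. The URL shape (an l_text overlay with a font and size) is my reading of Cloudinary's text overlay syntax and should be checked against their documentation. The helper name text_overlay_url, the cloud name "demo", the image name "sample.jpg" and the font are placeholders.

```ruby
require "erb"

# Builds a delivery URL with a text overlay. The text is percent-encoded
# so that spaces, question marks and the like survive inside the URL path.
def text_overlay_url(cloud_name, public_id, text, font: "Arial", size: 50)
  encoded = ERB::Util.url_encode(text)
  "https://res.cloudinary.com/#{cloud_name}/image/upload/" \
    "l_text:#{font}_#{size}:#{encoded}/#{public_id}"
end

puts text_overlay_url("demo", "sample.jpg", "How to generate caption images?")
```

Some characters, commas and slashes for example, may need extra escaping on top of this because they have a meaning of their own in the URL, so treat this as a starting point only.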

Trying with CSS

The CSS 3 spec has become very powerful. Once the layout is set, you need to take a screenshot of the rendered page, like in the first linked URL.


Read Later Tools are Productivity Killers

First, let's take a count of your open tabs as well as the items marked ‘Read Later’ in ‘Pocket’-like tools.

After Instapaper was created, everyone wanted to read their information in a clean way; those who could not tried to push it to their Kindles or Evernote. These people read a lot and wanted to read even more. When you push to other devices, you may lose count of what you need to read.

What I have realized is that reading and watching makes you do less writing (be it code or language).

I was this person about a year ago, till a friend talked me out of it. It's really impossible to keep up in real time with JavaScript frameworks, Apple rumors, what changed where and when it's being released.

I have heard in podcasts, seen and read about people who

  1. have tens (even 50+) of open tabs and complain about how Chrome / Firefox is hogging the memory
  2. wish for the browser to crash so that they can happily forget (eventually people reach this from the state above)
  3. keep their browser clean by pushing the stuff to read into Pocket-like tools.

The Read Later tools are for those who have lots of spare time on weekends and clear off their tabs and reading list within a week or two. If you keep something for later, it gets missed and forgotten, ending up in a virtual black hole. This is the reality whether you like it or not. It happens a lot when you are researching.

For those interested in the math: with tens of new HN posts every four hours, your list will pile up to 40 or so in a week (at the least), averaging 8 per weekday.

When you mark an article as ‘Read Later’, you are telling the tool to remind you to forget about it.

The Solution - A fixed size array

Write down the stuff you can't remember, and maintain a list of the top 50 things you want to read online at any point of time in each section. If you want to read new stuff about Java or Go or Linux or politics, spend the time right then or add it to your fixed size array: not a queue, not a stack, not a linked list. Just close it off once it's done.
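A toy sketch of such a list is below. The class name ReadingList and its methods are invented for illustration, and the capacity of 50 comes from the paragraph above. What matters is that adding to a full list is refused rather than growing it.

```ruby
# A reading list with a hard upper bound on its size.
class ReadingList
  attr_reader :items

  def initialize(capacity = 50)
    @capacity = capacity
    @items = []
  end

  # Returns true if the title was added, false when the list is full.
  def add(title)
    return false if @items.size >= @capacity

    @items << title
    true
  end

  # Closes off an item once it has been read; returns true if it was on the list.
  def done(title)
    !@items.delete(title).nil?
  end
end

list = ReadingList.new(2)
list.add("Go scheduler internals")
list.add("Linux cgroups overview")
puts list.add("Yet another JS framework") # false, the list is full
list.done("Go scheduler internals")
puts list.add("Yet another JS framework") # true, a slot was freed
```

A refused add is the signal to either read something now or drop it, which is what an unbounded queue never forces you to do.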

Helps you 😴 better at night.