Code Reviews using Art Critique Principles

Stacy Morse

github.com/geekgirlbeta

codecrit.com

Code Reviews / Art Critiques

Produce the best results possible and teach students how to learn and to take criticism

Enter the Code Review Rage Quit

People love to flip desks. “We are not doing code reviews anymore”. We want to believe that we’re the best possible developers that we can be.

We don’t realize that when you stare at the same stuff over and over, you miss things.

I can fix this. It’s not a trap! Why do they always think it’s a trap.

I want to look at this from the unique perspective of my art background.

What is this methodology about? Art critiques have been around for hundreds of years. If you can’t talk about your work, or if you can’t talk about other people’s work, you don’t understand what’s going on.

If you can’t talk about your code, it’s highly likely that you have just gone to StackOverflow and copied and pasted some gibberish without understanding what you’re doing. That’s bad.

We don’t make mistakes, we just have happy accidents [picture of Bob Ross].

Goals?

  • Happier devs
  • Improved morale
    • Don’t want to deal with processes that are abusive
  • Better code
  • Better code reviews
  • Less stress

Rules

Process! Without rules, there is chaos. If you know what is expected, you know what to expect.

  1. Respect
  2. Be prepared (don’t want to waste someone’s time). Code review is not time for mentorship. Code review is time to be productive.
  3. Stay neutral where being neutral is required. Leave negativity out of it.
  4. Constructive criticism only.
  5. Stay on point and focused. You can’t read code if you’re doing other things.
  6. Mandatory participation. We all just want to talk about our own projects, but you have to contribute to everyone’s. The person doing the review has to read the code… and understand it. A “thumbs up” is not a code review. It doesn’t give you any proof that the code was reviewed, and it doesn’t carry any shared responsibility.

The only time you need to go over the rules again is when you on-board someone.

Pre-Review

  1. Identify your work.
  2. Run ALL relevant testing.
  3. Know your code!
  4. Check that the code meets all requirements.
  5. Do not waste the time of others with unprepared code.

Code Crit

Code Critique Stages

  1. Description
    1. Identify
    2. Inventory (proof that we have read it!)
      1. What is it? What format do I make it? Where does it go? You just want to make me do busy work.
      2. Write it however will work best for your company.
      3. What do I do with it? If there is something wrong with the code, send it back to the person who wrote it.
    3. Neutral
  2. Analysis
    1. New Code + Old Code
    2. Facts needed for critical interpretation (write down issues right now)
  3. Interpretation (Description + Analysis)
    1. What do all of these observations mean
    2. In a constructive manner, make suggestions
      1. Leave the ego at the door
      2. Acknowledge the negative
      3. Contribute positive solutions
      4. Not a compliment sandwich.
      5. Be clear
      6. Provide solutions to fix the problem
      7. Use the notes from your inventory
      8. Contribute to a better environment
      9. No malice!
    3. When a person walks away from a code review beaten down, it lowers morale
  4. Judgement
    1. Can this code ship?
      1. Our code is important to us. It’s what we do. We should be paying attention to what we do.

Strategies to Edit Production Data

Julie Qiu, Engineering Lead for Spring (fashion ecommerce marketplace)

At some point, we all need to edit data in production. There are many ways to do it, some better than others. These are processes that evolved at the company where I work.

Why we edit production data

We’re all told that it’s bad practice. And then we go and do it anyway. Why? It turns out, there are legitimate reasons.

Internal tools are not available. I would have used another method, but it didn’t exist.

Edge cases exist. I tried to use the tool, but it didn’t work.

Time-sensitive changes are needed. We edit data in production because it’s usually the fastest and easiest (albeit not the safest) method available.

I want to run this query:

UPDATE products SET name='julies-product' WHERE id=1

It’s Friday night. Marketing needs me to make this change. They promised that something was going to be done. All of the other engineers have left. I could Slack someone and have them check out my SQL, but… I don’t want to bother someone on Friday night. What’s the worst thing that can happen?

What I actually ran:

UPDATE products SET name='julies-product'

“My God. What have I done!”

We have real-life disaster stories just like this from all over the valley. They happen. And they happen not because we are bad engineers.

Safer strategies

  • Raw SQL spreadsheet
  • Local scripts
  • Existing server
  • Task runner
  • Script runner service

How it works. What’s great. What’s not great.

Develop a review process for manual edits

Just a process to make sure we get spot checks.

What we did was simply maintain a Google spreadsheet to record and review queries before running them.

  1. Add a record to the spreadsheet
    • name, date, description, reviewer
  2. Reviewer approves or requests changes
  3. Run the query
  4. Use a transaction
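Step 4 ("use a transaction") can be sketched as below. This is a minimal illustration using Python’s built-in sqlite3, with a hypothetical products table — not the actual tooling from the talk — showing how checking the row count before committing would have caught the missing WHERE clause.

```python
import sqlite3

# Hypothetical table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO products (name) VALUES (?)",
                 [("widget",), ("gadget",)])
conn.commit()

try:
    cur = conn.execute(
        "UPDATE products SET name = 'julies-product' WHERE id = 1")
    # Inspect the damage before committing: exactly one row expected.
    if cur.rowcount != 1:
        raise ValueError(f"expected 1 row, would change {cur.rowcount}")
    conn.commit()
except Exception:
    conn.rollback()  # nothing is persisted if the check fails
    raise
```

Had the WHERE clause been dropped, `rowcount` would have been 2 and the rollback would have saved the table.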

What’s great?

  • Easy to implement
  • Audit trail
  • Teaches people the right thing to do

What’s not great?

  • Easy to make mistakes
  • Audit trail is voluntary (people have to remember to fill in the spreadsheet)
  • Difficult to run long and complex logic

Local scripts

How it works?

  1. Write a script
  2. Run the script (but do a dry-run)
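The two steps above, including the dry-run, might look like the following sketch. The script, function, and flag names are all hypothetical; the point is that the safe mode is the default.

```python
import argparse

def fix_product_names(ids, dry_run=True):
    """Hypothetical data fix: dry-run only reports what would change."""
    changed = []
    for pid in ids:
        if dry_run:
            print(f"[dry-run] would update product {pid}")
        else:
            print(f"updating product {pid}")
        changed.append(pid)
    return changed

def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("ids", nargs="+", type=int)
    parser.add_argument("--no-dry-run", action="store_true",
                        help="actually write changes")
    args = parser.parse_args(argv)
    # Dry-run unless explicitly disabled.
    return fix_product_names(args.ids, dry_run=not args.no_dry_run)

if __name__ == "__main__":
    main()
```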

What’s great?

  • Reusable (with different arguments)
  • Easy to manipulate outputs
  • Access to common code (easy to import functions, reuse logic, etc)

What’s not great?

  • Easy to make mistakes
  • Logs are only available locally
  • Network disconnections
    • What do we do if the scripts take a really long time to run?
    • We have 50 million products in our database… what if we want to do something to all of them?
      • I used to come into work really early and start my script and wait all day
      • Would it be nice if we had a computer on all the time?! (Duh!)

Existing server

How it works?

  1. Write a script
  2. Get the script onto the server
  3. ssh and run inside a session

What’s great?

  • Ability to run long scripts
  • Reliable network connectivity (generally)
  • Infrastructure already exists

What’s not great?

  • Scripts can affect resources on that server
  • Not user friendly (deployment process can be a pain)
  • No persistent audit trail

Task runner

How it works?

  1. Write a script
  2. Code review and run tests
  3. Input arguments and run (from the Jenkins interface)

What’s great?

  • All the repetitive things are done once. Logging built in.
  • Persistent audit logs
  • Can watch the task run from the console
  • Code review and automated tests
  • User interface

What’s not great?

  • Hard to manage credentials
  • Environments are not clearly separated
    • Could easily accidentally use the production db URL but input a bunch of parameters from dev
  • Inputs are not verified

Script runner service

How it works?

User Interface -> Script Runner (dev) -> Database (dev)
               \> Script Runner (prod) -> Database (prod)

The user just has to select an environment rather than input every argument into the script.
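The "select an environment" idea can be sketched as below — a single environment name drives every connection setting, so a production URL can never be paired with dev parameters. All names and URLs are made up for illustration.

```python
# Hypothetical config: one environment key selects ALL settings together.
ENVIRONMENTS = {
    "dev":  {"db_url": "postgres://dev-db/spring",  "dry_run": True},
    "prod": {"db_url": "postgres://prod-db/spring", "dry_run": False},
}

def get_config(env):
    """Return the full config for one environment; refuse unknown names."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    return ENVIRONMENTS[env]
```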

What’s great?

  • Centralized config management
  • Separation of environments
  • User friendly interface

More that we can do

  • Parallelize and scale
  • Preview results
  • … up to you to customize

What’s not great?

  • Nothing. We’ve done it all

Which strategy is right for you?

Depends on your willingness to invest in infrastructure. Depends on your time available.

Make something usable, because engineers are also people!

Invest the effort, it’s worth the cost.

WHAT IS THIS MESS? (Writing tests for pre-existing code bases)

github.com/mrname/

I work for spinifexgroup. We write code that powers crazy contraptions.

What is Legacy Code?

Old? Not necessarily the case. Old != Legacy. Legacy code is:

  • Lacking tests
  • Lacking documentation (of every kind)
  • Difficult to read or reason about

If your code matches 1+ of these things, it is legacy.

Why shouldn’t we test code

  • It will be replaced eventually
    • NO IT WON’T… maybe it will, but not right now.
    • It was written to fill a critical business need. Otherwise they wouldn’t have spent the time to write it.
  • It has been working just fine
    • HAS IT?
    • Legacy code usually has poor logging or error handling, so there could be things going on that you don’t know about.
  • I do not have time
    • It takes more time and money to debug issues in the wild
  • I’m lazy
    • LAZY means being too lazy to fight fires, not being too lazy to write tests
  • Why do I have to be the one to start?
    • Because you’re a good person.

Let’s do this

  • My tests will never cover this code base
  • One rule: test what you touch
    • Write a test that asserts current behaviour BEFORE YOU TOUCH CODE
    • Write a test that asserts new/modified behaviour and LET IT FAIL
    • Do code stuffs
    • Test passes
  • The first test is the hardest test

Choose Thy Weapons

  • Find a good test runner (pytest)
  • Use pre-existing tools for your framework (pytest-django)
  • pytest-socket to disable sockets, JUST IN CASE
  • pytest-cov to measure coverage
  • factory_boy to avoid boilerplate

Requirements file: .in (unpinned)

pytest.ini: django_settings, etc.
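A pytest.ini along those lines might look like this sketch — the project name and option values are placeholders, assuming pytest-django and pytest-cov are installed:

```ini
; Hypothetical pytest.ini; "myproject" is a placeholder.
[pytest]
DJANGO_SETTINGS_MODULE = myproject.settings
addopts = --cov=myproject --cov-report=term-missing
python_files = tests.py test_*.py
```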

Assignment 1 (feature request)

Update “Quark” mode to conditionally determine the value of the “magic” property.

Everyone who knew what “quark” and “magic” meant has left the company.

Where did we go wrong? Nothing wrong with unit tests. But we should test the FUNCTIONALITY of legacy applications. Need to do integration testing?

Don’t go nuclear. When dealing with legacy code, there’s always a reason, even if that reason no longer exists.

Assignment 2: Dealing with code that touches the world

Get back a new array with crazy sauce on top of it.

Let’s read the code but do not touch

Where to start? Just try to get the class into a test harness. pytest-socket blocked e-mail connection. We can’t even instantiate this thing.

Breaking dependencies… what is this tangled web of madness

Subclass it. Override the smtp object and mock it. But we want to test the view. And it fails.

Mocking helps us get past the SMTP problem.

Making the world a better place: adding docstrings and comments is the safest change you can ever make.

Hold up now… that thing shouldn’t be mutating data. But we could have broken things using that mutated argument. We should just fix the broken thing. But we cannot change code without tests, and cannot write tests without changing code.

Sometimes it makes sense to change code to make it more testable, but ONLY if we can do that without breaking the interface.

E.g. let’s pass in the smtp object.
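Passing in the smtp object can be sketched like this. `Notifier` and its methods are hypothetical stand-ins for the legacy class; `smtplib.SMTP` stays as the default so the existing interface keeps working, while tests inject a mock instead of opening a real connection.

```python
import smtplib
from unittest import mock

class Notifier:
    """Hypothetical legacy class, now accepting an injected SMTP factory."""
    def __init__(self, smtp=None):
        # Default preserves the old behaviour; tests pass a fake.
        self._smtp = smtp or smtplib.SMTP

    def send(self, host, sender, to, body):
        conn = self._smtp(host)
        conn.sendmail(sender, [to], body)
        conn.quit()

# In a test: inject a MagicMock so no network I/O happens.
fake_smtp = mock.MagicMock()
n = Notifier(smtp=fake_smtp)
n.send("localhost", "a@example.com", "b@example.com", "hi")
fake_smtp.assert_called_once_with("localhost")
```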

Assignment 3

Adding a new feature to an unruly process. Add SMS reporting.

Create a legacy (the good kind)

CI/CD

Pushing code should run a build that runs tests. Builds should fail if tests fail. Builds should fail if test coverage drops.

Keep testing! Start testing things that you don’t need to test. Test sprints. Brain refreshment.

Now go forth and test things!

Adapting from Spark to Dask: what to expect

Irina Truong

Backend Engineer at Parse.ly

To Dask or not to Dask? You may be considering a journey…

  • Is it possible?
  • Is it worth it?
  • What about performance?

78 more lines of Java stack trace from exceeding memory limit

Job aborted due to stage failure

Driver stacktrace: (and then nothing)

No more java heap space.

I am here to help you set realistic expectations.

What is Spark?

  • General data processing engine
  • Java, Scala, Python, R
  • Batch processing, stream processing, SQL engine

What is Dask?

  • Parallel computing library for analytic computing
  • Python only
  • Written in Python, C and FORTRAN
  • Can run on Hadoop YARN, EC2, Mesos, or a local cluster

So I had this Spark application…

Input Parquet record: customer, url, referrer, session_id, timestamp

Aggregated JSON record: looks like a JSON dict, with a special ID to index into Elasticsearch. How many people came from which referrers.

I wanted to use a real example b/c most Dask examples are abstract and dry.

Input data structure is partitioned into columns

Aggregate and transform

Round timestamps down to an hour

# Group by customer, url and rounded time period
COUNT_VALUES(referrer) (referrers) # this is a user defined aggregation function

I had to write it in scala, build a jar, give it to the Spark runner, register it with Python, etc., etc.

Aggregate with Dask

Dask uses the same syntax as Pandas

df['ts'] = df['ts'].dt.floor('1H')

Dask needs two custom aggregations

counter = dd.Aggregation(
    'counter',
    lambda s: s.apply(counter_chunk),
    lambda s: s.apply(counter_agg)
)
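The snippet above references `counter_chunk` and `counter_agg` without showing them. Here is a plausible pure-Python sketch of what they might do — the function bodies are an assumption, not the talk’s code: the chunk function counts values within one partition, and the agg function merges the partial Counters across partitions.

```python
from collections import Counter

def counter_chunk(values):
    """Count referrer values within one partition (per chunk)."""
    return Counter(values)

def counter_agg(partial_counters):
    """Merge the per-chunk Counters into one total count."""
    merged = Counter()
    for c in partial_counters:
        merged.update(c)
    return merged
```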

Two levels of hierarchy in the multilevel columns

Transformation:

Performance?

Benchmark on local laptop and YARN cluster (Amazon EMR)

Cluster deployment with Spark (AWS)

github.com/rq/rq and a UI to schedule jobs remotely

Cluster deployment with Dask (Kubernetes). Amazon Elastic Container Service for Kubernetes (EKS) is in beta.

Promises

  • No more Java heap space
  • Or containers killed by YARN
  • Meaningful stack traces
  • Real-time visual feedback
  • No profiling

Can see in real time how workers are doing and what their resource usage is.

The bad:

  • Cluster deployments are WIP (dask-ec2, knit, skein, kubernetes)
  • Does not support SQL
  • Only simple aggregations (no collect_list or nunique)
  • Less docs and examples
  • Can’t write JSON
  • More bugs

The full code: < 250 LOCs

Slides are here

TODO

I wanted to go to the following talks this day, but had a scheduling conflict:




Published

11 May 2018

Category

work
