PyCon Day 3: Morning Sessions
Engineering Cultures
with Kate Heddleston
Environment has a huge effect on our behaviour, but we’re not willing to admit it. Abercrombie & Fitch did a study that showed excited people buy more! You notice this if you ever go into their store at the mall. House music is playing… you want to buy things.
Criticism
A Japanese motorcycle manufacturer did a study in which the motorcycle gave the rider accurate, constructive feedback/criticism. They found that it made the driver drive even more aggressively and crazily.
Criticism, even when intended to be constructive, is nearly always resented by the receiver.
Critical feedback in reviews:
- 88% to women (76% personal in nature)
- 22?% to men (2% personal in nature)
How do we level the playing field? Remove the criticism.
You might think that’s crazy… how do you give people feedback without giving them critical feedback? Instead, tell people what you want them to do. Be explicit about what you want them to do and the behaviour that you want them to exhibit.
E.g. “Don’t make your pull requests so big” vs. “Open smaller pull requests”.
Also, don’t make it personal. You’re at work, keep it related to work and professional.
Argument Cultures
What is an argument culture? People attack the weak parts of a position, with the intent that only the strong parts survive. In reality it makes us emphasize winning at all costs, which is usually at odds with finding the best solution or finding the shared truth. When the emphasis is on winning, people do crazy things and exhibit unethical behaviour. We see this in sports:
- doping
- attacking
- deflating footballs
And that’s in a regulated competition with officials and referees. Workplaces are essentially unregulated competition.
An example of undermining someone’s position: “You seem frustrated.” This makes it sound like someone is emotional and it undermines their rational credibility.
In order to get rid of unregulated, aggressive arguments:
- Defer judgement; use a brainstorming session; the whole point is to get more ideas.
- Help people stay on topic.
- If two people seem to be on opposite sides of an argument, use a regulated environment.
Onboarding and Team Debt
In Software Engineering, we talk a lot about “technical debt”. If you’re shipping features really quickly, you might not be thinking about your architecture and the overall effect of your additions. If you do this long enough, it can topple your system.
Team debt is the same idea with people. When people aren’t onboarded properly, not everyone is operating at 100% of their capacity. If that goes on for long enough, then as you add more people it can eventually topple your team.
Solution? Onboarding.
What happens when there isn’t systematic onboarding? If there’s no structure to onboarding, then everything depends on the current social structure. People have to rely on making friends, going to social events, wandering around and finding someone to get what they need.
Systematic onboarding should remove reliance on existing social structures, so that people who come on board and are different from the existing group have just as high a chance of being successful at their jobs.
The NULL Process
There is no “formalized” process. The NULL process is where the process points to nothing meaningful. In a computer program, using a NULL reference as if it pointed to something can crash the program. The NULL Process happens because people at the head of companies have experienced “bad process” and fear that any process is bad process. In reality, the NULL process is simply a subset of bad process.
The NULL process is a proxy for unwritten rules that everyone should “just know”, but doesn’t. Companies don’t want to take the time to make the process.
NULL processes affect everyone, but disproportionately affect women and minorities.
Twitter just had a lawsuit filed against them. A woman was denied promotion twice, despite excellent reviews and no critical feedback. She alleges that because they had no process, there was a “tap on the shoulder” culture that favored men over women.
Solution? There are a few things that you can do that are lightweight. Like… checklists!
- Collaborative
- Simple and easy to make
- lightweight
- reliable and powerful
- E.g
- How engineers respond to code reviews
- How code is deployed
- How people get promoted
- Checklists can be automatable
Questions
Q: NULL process
A: Agile management. High feedback system. Know what’s going on… get feedback from employees all the time and respond quickly.
Q: SE is not the only field that lacks diversity. What makes SE different?
A: The Tyranny of Structurelessness. It discusses how, when there is no explicit structure, there is implicit structure. We’re just on the extreme side of things with respect to hating regulation.
Q: I like the idea of Agile Management. Following up, how do you ensure that you get the feedback that you need, especially if your environment is so toxic already?
A: Really negative responses to feedback are the biggest thing that squelches it. Acknowledgement is the biggest thing. Figure out the problem, figure out the root of it. If people know that they’ll be listened to and responded to, they’ll let you know what you need.
Q: Some places thrive on giving critical feedback. What’s the best way to approach them?
A: Criticism is often for the purpose of boosting the egos of the people who are criticizing. I avoid those cultures, b/c it seems like there’s no way to turn them around.
Q: How do you deal with joining the team and having the responsibility of being the first person of “diverse background” and being expected to change the culture?
A: When I run into that, I say, [cheekily] “Yeah sure, make me a Director!”
Learning from other’s mistakes: Data-driven code analysis
with Andreas Dewes
Physicist
Our mission: help programmers to write good code
What is good code? If you ask the question, you always get 10 different answers. So, I try to take the lowest common denominator.
If I were in an infomercial, I’d have a trusty sidekick who asks me “How do we do this?” And I’d respond: “Just use the ‘Quality Master: 2000’”.
Tools and Techniques for ensuring code quality
In reality you have to use a variety of techniques that span a two-dimensional space: manual to automated, and static (code is not executed) to dynamic (code is executed).
- Static Analysis
- Unit testing
- Manual code reviews
- Debugging/ profiling
Static analysis requires the least effort on the part of the programmer, so that’s what we’re focusing on
Static analysis
I’m going to write the correct version of a function, and then I’m going to introduce some bugs. (When I wrote it, I started with two bugs initially, so I didn’t even need to add any).
- I tried to iterate over a dictionary rather than dict.items()
- I tried to use real.imaginary instead of real.imag
Dynamic analysis (e.g. unit testing) would catch these, because the tests would fail.
Static Analysis (for humans). Read it and try to figure out what is going on. We could have found these bugs, but we would have needed some knowledge about what the attributes are actually named.
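The two bugs described above can be reproduced in a few lines. This is a hypothetical reconstruction, not the function from the talk; the names `describe_buggy`, `describe_fixed` and `values` are invented for illustration.

```python
def describe_buggy(values):
    """Buggy version: contains both mistakes from the list above."""
    lines = []
    # Bug 1: iterating a dict yields keys only, so unpacking into
    # (name, number) fails for keys that are not exactly two items long.
    for name, number in values:
        # Bug 2: complex numbers have .imag, not .imaginary.
        lines.append(f"{name}: {number.imaginary}")
    return lines


def describe_fixed(values):
    """Fixed version: iterate over items() and use the real attribute name."""
    return [f"{name}: {number.imag}" for name, number in values.items()]


values = {"alpha": complex(1, 2), "beta": complex(3, -4)}
print(describe_fixed(values))  # ['alpha: 2.0', 'beta: -4.0']
```

Neither bug shows up until the function is actually called, which is why a unit test catches them and a quick read-through may not.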
Static Analysis for computers is a huge research field.
- Compile the code into an AST
- Annotate it with additional information
- Parse the AST to find the problems
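As a rough illustration of the first and last steps, Python’s standard `ast` module can parse source code into a tree and walk it. The sketch below skips the annotation step entirely, so it flags every `.imaginary` access regardless of the object’s type; `find_attribute_uses` and `SOURCE` are hypothetical names, not part of any tool mentioned here.

```python
import ast

SOURCE = """
z = complex(1, 2)
print(z.imaginary)
"""


def find_attribute_uses(source, attr_name):
    """Return (line, column) for every `<expr>.attr_name` access in source."""
    tree = ast.parse(source)  # step 1: compile the code into an AST
    return [
        (node.lineno, node.col_offset)
        for node in ast.walk(tree)  # step 3: traverse the tree looking for problems
        if isinstance(node, ast.Attribute) and node.attr == attr_name
    ]


print(find_attribute_uses(SOURCE, "imaginary"))  # [(3, 6)]
```

Without type information this would also flag a user-defined class that legitimately has an `imaginary` attribute, which is where the annotation step earns its keep.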
Tools
- PyLint
- PyFlakes
- Pep8 (performs both style checks and structural checks)
- … and many others
Limitations
- Checks are hard to create / modify
- E.g. PyLint code for analyzing try/except blocks
- Long feedback cycle
- E.g. if you find a false positive
Our approach
- Code is data!
- Make it easy to specify errors and bad code patterns
- Learn from user feedback and publicly available code
Step 1: Build the code graph (AST)
Step 2: Annotate nodes with typing
Step 3: Give each vertex a unique hash id that we can use to identify it (if a function is defined more than once, we have the same hash)
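The hashing scheme used in the talk is not described in these notes. One way to get a similar effect, as an assumption, is to hash the text produced by `ast.dump`, which leaves out line and column information by default. The helper `node_hash` and the sample functions are hypothetical.

```python
import ast
import hashlib


def node_hash(node):
    """Hash a subtree by its structure, ignoring position and formatting."""
    # ast.dump omits line/column attributes by default, so the same
    # definition written with different whitespace dumps to the same text.
    dumped = ast.dump(node)
    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()


SOURCE = """
def area(w, h):
    return w * h

def area(w,   h):
    return w*h

def perimeter(w, h):
    return 2 * (w + h)
"""

functions = [n for n in ast.parse(SOURCE).body if isinstance(n, ast.FunctionDef)]
hashes = [node_hash(f) for f in functions]
print(hashes[0] == hashes[1])  # True: exact duplicate despite whitespace
print(hashes[0] == hashes[2])  # False: different function
```

Because whitespace never reaches the tree, two copies that differ only in formatting collapse to one hash, which is what makes duplicate detection and semantic diffing cheap.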
Example: Tornado
Advantages
- Easy detection of exact duplicates
- Semantic diffing of modules, classes, functions
- Semantic code search on the whole tree
- No more discussions about whitespace and line lengths! We can end this war!
Code issues = patterns in the graph
If we did the type annotation correctly, we will be able to detect that this is a complex number and will see that it doesn’t have that attribute. You can use YAML to describe the graph patterns:
node_type: attribute
value:
  $type: complex
attr: imaginary
Another example: for loop without break statement
node_type: for
body:
  $not:
    $anywhere:
      node_type: break
orelse:
  $anything: ...
Or we can replace that with a search for a break. But loops can be nested, so a break may belong to an inner loop. So we can extend it again and use an exclude node that stops processing when it finds a for subtree.
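For comparison, here is roughly what the same check looks like when written by hand against Python’s `ast` module, assuming the pattern is meant to flag a `for ... else` whose body never breaks out of that loop. The function names are hypothetical, and the sketch simplifies by skipping nested loops entirely (including their `else` clauses).

```python
import ast

LOOPS = (ast.For, ast.AsyncFor, ast.While)


def has_own_break(nodes):
    """True if a break is reachable without entering a nested loop."""
    for node in nodes:
        if isinstance(node, ast.Break):
            return True
        if isinstance(node, LOOPS):
            continue  # the "exclude" step: a break in there belongs to the inner loop
        if has_own_break(ast.iter_child_nodes(node)):
            return True
    return False


def for_else_without_break(source):
    """Return line numbers of for loops that have an else but no break."""
    return [
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.For) and node.orelse and not has_own_break(node.body)
    ]


SOURCE = """
for x in items:
    if x:
        break
else:
    print("none")

for y in items:
    for z in y:
        if z:
            break
else:
    print("always runs")
"""

print(for_else_without_break(SOURCE))  # [8]
```

The declarative pattern expresses in seven lines what takes a recursive helper here, which is the argument for describing checks as data.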
You don’t have to write an SQL engine every time you want to write a new query. This way we have a good chance of people actually writing their own checks.
Summary
Crowd-sourcing code quality tools!
Questions
Q: How do we encourage developers to use static analysis on things outside of the code base?
A: Ideally one giant AST that contains the Universe. But yeah, you can do it and use it on the dependencies. Just store the library in the graph as well and connect it to your own AST code. Vision is to connect as many of them as possible, so that we can see the relationship between them.
Q: In physics software, we are very concerned about memory problems, such as people throwing a massive amount of data at a fixed-length array. Are there any checks for that?
A: No. Pull Requests Accepted.
Technical Debt - The code Monster in Your Closet
with Nina Zakharenko
Slides are here
I’ve been a developer for 8 years with companies that span the gamut from multinationals to 5-person start-ups. I’ve seen this everywhere.
Technical debt results from “…a series of bad decisions, both technical and business.”
Eventually you find yourself “…using more resources to accomplish less.”
What decisions were made in the past that keep me from getting things done today?
What causes this?
Me. You.
If you’ve written a line of code, you’ve created some technical debt.
Examples of mistakes I made early on
Not seeing the value of Unit Tests (“Why should I? It works!”)
Not knowing how to say NO to features. When managers say that X, Y, Z are awesome! You need to think things through before just slamming them into the code base.
Being overly optimistic with estimates
Putting releases over good design and reusable code
“I’ve learned the error in my ways! I’m different now!”
Time crunch.
- The project was due yesterday! I’ll take a shortcut, and clean up the mess tomorrow.
- Who has said that (lots of people, raise their hands). Who has actually gone back and cleaned up (like 5% of those people raise their hands).
- Don’t find yourself in this position in the first place.
Unneeded Complexity
- LOC committed != amount of work accomplished
- Some people think that if they came up with a simple solution, then they might be missing something. Those people are usually wrong. Simple is good.
Lack of Understanding
- Step 1: Have a problem
- Step 2: Look up how to do it on StackExchange
- Step 3: Copy and Paste it into your codebase
- Step 4: ???
- Step 5: Bugs!
Result? Culture of Despair
- This code base is already a heap of garbage
- Will anyone really notice if I toss another broken glass bottle to the top?
Red Flags
- Code Smells
- not (necessarily) bugs
- an indication of a deeper problem
- (all of the things that CodeClimate checks)
- half-implemented features (someone thought we needed it at the time, but we never have)
- no or poor documentation
- cowboy coders… no one else will ever look at this, I’m going to work here forever
- E.g. return ‘gmo white pesticide dough’
- commented out code, incorrect comments
- no tests or broken tests
- broken tests are even worse than no tests
- they create negative feedback… I don’t want to run the tests, I’ll get a big red arrow
- Architecture and design… smells
- parts of the code that no one wants to touch
- changing code in one area breaks other parts of the system
- severe outages caused by frequent
- Good design
- 80/20
- Bad design
- Functionality changes, variable names don’t
- Monkey Patching
- Doing this in testing is fine
- If you’re patching a 3rd party framework, do not do this!
- What exactly does that decorator do?
- They are very powerful
- They can be very confusing
- Check for undesired side effects
- Circular dependencies
- Don’t do imports inside of a method
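On the monkey patching point: one way to keep a patch confined to a test is the standard library’s `unittest.mock.patch.object`, which restores the original attribute when its `with` block exits. A minimal sketch, with `timestamp_label` as a hypothetical function under test:

```python
import time
from unittest import mock


def timestamp_label():
    """Code under test: depends on the real clock."""
    return f"built-at-{int(time.time())}"


def test_timestamp_label():
    original = time.time
    # The patch is active only inside the with-block.
    with mock.patch.object(time, "time", return_value=1234.5):
        assert timestamp_label() == "built-at-1234"
    # On exit the real function is back in place.
    assert time.time is original


test_timestamp_label()
```

The patch cannot leak into other tests or into production code, which is the difference between this and permanently overwriting part of a third-party framework.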
Bad code and code smells are not technical debt, but they are part of a larger problem.
Case Studies
We still have applications that were running when JFK was President. ~IRS Chief
50 year old technology: “And we continue to use the COBOL programming language”.
This is not just the IRS:
- Banks and Financial Institutions
- Universities
- Air Traffic Control
Story Time
Used to work in finance. Banking systems were run in mainframes. Bankers were frustrated. They wanted a UI, cut-and-paste, technology that has been around since the 80s.
Idea! Write a fancy new front end. It will have “All the things”. Rewriting the back end is too expensive. Leave the mainframes, they’re too expensive.
So they brought in a bunch of high-paid architects and came up with a “great” system that used: “Cursors”.
Mainframe will output a text screen. Results are parsed by reading in the whole screen. Reading variables from the screen in certain positions.
Results? The new system was incredibly slow and error prone. You had to wait for a screen to be printed out and parsed to get any information. If the information was on column 81, you were hosed.
The bankers hated it. After months of work, the multi-million dollar project was scrapped. You can put lipstick on a pig, but it’s still a pig.
Another project: the MVP (minimum viable product). Get the product out to early customers as early as possible. It was a great idea, and it was successful. But the core project was created by a lone developer in a coffee fueled 48 hours. It was a great success. But there was a problem.
Years went on and the initial code and design didn’t go away. Instead it became the base for an expanding project, with expanding features. Technical debt just got swept under the rug.
Scope creep set in. The codebase was incredibly complex, with more features and more moving parts than necessary. When a release was pushed, something was bound to break, and it felt like your fault. It sucked for the developers and morale was very low.
Everything ground to a halt. The project was cancelled.
Sometimes you need to burn it. With fire! You might be surprised by how little time it can actually take to rebuild it.
Battling the monster
- Don’t point fingers
- Don’t blame Joe even if he left 4 years ago. You took on the project
- We all need to be part of the solution
- Work together
- PEP8 is a good place to start
- Figure out your code standards; write it down
- Pair programming
- It’s a lot easier to deal with problems when you have someone else helping you.
- Code reviews
- Unless something is on fire, unreviewed code does not go into master.
Stay Accountable
- Unit and integration tests
- I consider myself a good programmer, but there have been a lot of bugs that have been caught
- Run the tests with a git pre-commit hook
- Use Continuous Integration (e.g. Travis)
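A pre-commit hook is just an executable file at `.git/hooks/pre-commit`; Git aborts the commit if it exits with a non-zero status. A minimal sketch in Python follows, assuming the project’s tests run under the standard library’s `unittest` discovery (substitute whatever test command the project really uses). The names `TEST_COMMAND`, `run_tests` and `main` are hypothetical.

```python
#!/usr/bin/env python3
"""Sketch of a .git/hooks/pre-commit script: block the commit when tests fail."""
import subprocess
import sys

# Assumption: tests are discoverable by unittest from the repository root.
TEST_COMMAND = [sys.executable, "-m", "unittest", "discover"]


def run_tests(command=TEST_COMMAND):
    """Run the test command and return its exit status."""
    return subprocess.run(command).returncode


def main():
    # A non-zero exit status tells Git to abort the commit.
    sys.exit(run_tests())

# The installed hook file would end with a call to main().
```

The file also needs to be marked executable. Hooks are not copied when a repository is cloned, so each developer has to install the hook locally; a CI service covers the case where someone has not.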
Sell it to Management
By allocating some project time to tackling debt, the end result will be less error prone, easier to maintain and easier to add features.
Cost vs. Time
Argue the Ski Rental Problem
- You are going skiing for an unknown number of days.
- It costs $1 a day to rent or $10 to buy.
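The arithmetic behind the ski rental problem is small enough to sketch. The prices come from the example above; the function `total_cost` and its `buy_on_day` parameter are hypothetical names introduced for illustration.

```python
RENT_PER_DAY = 1   # dollars per day, from the example above
BUY_PRICE = 10     # dollars, from the example above


def total_cost(days_skied, buy_on_day=None):
    """Cost of renting until buy_on_day, then buying (None means never buy)."""
    if buy_on_day is None or buy_on_day > days_skied:
        return days_skied * RENT_PER_DAY
    return (buy_on_day - 1) * RENT_PER_DAY + BUY_PRICE


# Columns: days skied, cost if always renting, cost if buying on day 1
for days in (3, 10, 30):
    print(days, total_cost(days), total_cost(days, buy_on_day=1))
# 3 3 10
# 10 10 10
# 30 30 10
```

Renting wins for a short trip and buying wins for a long one, with break-even at 10 days. The difficulty is that the number of days is unknown up front, which is the situation a team is in when deciding whether to keep paying for a workaround or invest in a fix.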
There’s another cost: People
- Hiring developers is hard.
- Technical debt frustrates developers.
- Frustrated developers are more likely to leave.
Some lingering technical debt is inevitable. Don’t be a perfectionist. Figure out the problem and deal with it.
Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. ~Martin Golding
It’s for your safety.
Paying down your debt
Prioritize: What causes the biggest and worst pain points
Shelf life: What is the life expectancy of this project? longer shelf life -> higher interest
Technical debt can be strategic. If you don’t have to pay it off, you got something for nothing. If a system is getting decommissioned and you don’t have to pay it off, you just won. But this is rarely the case.
Refactoring: a systematic way of changing the code without changing the functionality. A clearer way of saying the same thing. Slow and steady wins the race. The end goal is to refactor, without breaking existing functionality (otherwise it’s a negative feedback cycle). At this point tests are absolutely necessary.
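A small illustration of refactoring under test: the two functions below are meant to behave identically, and the assertions pin that down before the old version is deleted. The shipping-cost example and all its names and numbers are invented for illustration.

```python
def shipping_cost_before(weight_kg, express):
    """Original version: nested branches with duplicated numbers."""
    if express:
        if weight_kg > 10:
            return 25 + 15
        else:
            return 10 + 15
    else:
        if weight_kg > 10:
            return 25
        else:
            return 10


EXPRESS_SURCHARGE = 15


def shipping_cost_after(weight_kg, express):
    """Refactored version: same results, stated more directly."""
    base = 25 if weight_kg > 10 else 10
    return base + (EXPRESS_SURCHARGE if express else 0)


# The safety net: both versions must agree on every case we care about,
# including the boundary at exactly 10 kg.
for weight in (0, 10, 10.5, 40):
    for express in (True, False):
        assert shipping_cost_before(weight, express) == shipping_cost_after(weight, express)
```

If the assertions fail after a change, the change was not a refactoring; it altered behaviour.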
Making time: depends on the scope of the problem and the size of the team.
- Small (1-2 people): devote a week every 6-8 weeks
- Medium: devote a person per week and rotate
- Large: everyone devotes 10% of their time
Note that your time estimates should now take this into account.
Final Tips
Despite the common misconception, code is for humans, not for computers. Make sure your variable names are human readable, leave comments, and be mindful about it.
Code is for humans to read!
Don’t Repeat Yourself (BUT)
If being DRY requires mind-bending backflips and abstractions, stop. Right now. This applies to all “best practices”. They may apply to you, but not all the time.
The Boy Scout Rule: “Always leave the campground cleaner than you found it.”
“Always check in a module cleaner than when you checked it out”
If you have a heaping pile of trash for a code base, pick up that broken bottle on top and recycle it.
Expect to be frustrated. You could be cleaning up days / months / years.
Don’t give up.