Morning Keynote by Ying Li

Software Engineer at Docker

Video is here.

TL;DR - Ying Li compared health care to software security. In health care, we don’t all need to be medical professionals but can prevent disease and death by being aware of risks and following simple safety guidelines. Likewise, we don’t all need to be NSA level hackers to create secure software. Follow simple checklists and be aware of threats and best practices being communicated by security professionals.

I get vaccinations and take first aid courses because I want to contribute to the common good.

I get regular security updates, because I want to protect my private information as well as all my friends who have ever given me their personal information.

Everyone Security

You don’t have to be a medical professional to be a parent

I had never held a baby prior to my own

We keep checklists. Feed the baby, clothe the baby, take a nap.

I also ask fellow parents when I don’t know what to do.

The baby is still alive!

We don’t all need to be doctors.

Basic software security is like this. You don’t have to be able to hack into a server or bypass Django’s security to write safer software.

You don’t need to go to BlackHat or DefCon to install security updates.

Pretty scary to be responsible for something that you have no experience in. But with the right support system you can do it.

The Professor of 0’s (a children’s book)

Delores is a web dev at “House of Pops”. A stranger shows up… Gwen a good hacker. 0[day]Z House of Pops is vulnerable to server side template injection.

The brilliant children’s story goes on to semi-follow the plot line of the Wizard of Oz while discussing the process of maintaining a security culture in software development. Quite entertaining!

JetBrains and PSF Survey

26% of developers are web devs

Use frameworks or libraries that provide tooling for auto escaping input
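As a rough illustration of what auto-escaping does, the sketch below uses the standard library's html.escape to neutralize markup in untrusted input before it lands in a page. The render_comment helper and the sample payload are made up for this example. Template engines such as Django's can apply this kind of escaping for you.

```python
import html


def render_comment(user_input: str) -> str:
    """Wrap untrusted text in a paragraph tag (hypothetical helper)."""
    # html.escape turns &, < and > (and, by default, quotes) into entities,
    # so the browser shows the text instead of interpreting it as markup.
    return "<p>" + html.escape(user_input) + "</p>"


if __name__ == "__main__":
    print(render_comment("<script>alert('hi')</script>"))
```

The point of the checklist item is that nobody has to remember to call the escape function by hand: the framework does it on every value unless you explicitly opt out.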

Devs need to learn more about specific instances of these vulnerabilities

Redact and protect any sensitive information

OWASP

27% of devs focus on data science

Data is gathered into increasingly large and increasingly vulnerable collections. Collectors of data are targets of phishing attacks.

Separate data collection from data analysis.

Data that’s no longer needed for analysis should be discarded.

9% DevOps/Automation

Impossible to make a system impenetrable, but it is possible to make a system resilient. Prevent lateral damage. Separate networks. Each component should be given minimal access, only to what it needs.

3% of devs are in testing

Security is part of software quality.

Bandit and Python Taint (PyT) are good tools.

9% of Python users are focused on education

Bring up security early and often. Students should think about security as much as performance and concurrency. Formal threat modelling. System analysis is a good way to start thinking about formal threat modelling. Ask questions about all of the data in the system and how it gets there. What sort of privilege is needed to access that data?

There are risks. Sometimes infants die for no apparent reason. Preventing this is of great interest to pediatricians. Safe sleep guidelines.

  • Back to sleep
  • Firm flat surface
  • No blankets

Since the campaign, the SIDS rate has dropped over 50%. 2500 fewer deaths / year since 1994. 30k lives saved since 1994. Back to sleep didn’t eliminate SIDS, but it’s a huge accomplishment for a 3 item checklist.

Software may not seem as important, but your software could be used in medical equipment or flight control, etc.

Cross Site Request Forgery

We can all make our applications safER, if not perfectly safe, by following basic checklists.

We probably administer our own machines. We write code.

Every one of us will care enough about security and data sensitivity that we continue it all throughout our work.

Qmisha Goss: The People & Python

Detroit Public Library

The Story

A librarian and bored children

The Idea: Coding for Kidz

“I don’t know how to code!”

“You’re young, you can learn it.”

Started using Tynker.com

Wanted 20 kids, got 26. Attendance trickled down. Why? “That’s baby stuff”. Dude, you’re 7!

So I got some Raspberry Pis. Administration said, “Are they going to be hacking our computers?”

I went to the PyCon education summit. People said, “You should go to PyOhio,” so I did. Went to Picademy. First people in the US who are certified Raspberry Pi educators. Had to get myself to a level where I could teach these kids something that wasn’t baby stuff.

Teen tech week. We have been working on micro:bits. Coded to music. Minecraft Pi. We let them code in their Minecraft game. “Everything is flowing lava. Or everything is exploding TNT.” Now, we’re doing robot cars. My range was 6-17 year old kids. 6 year olds are not ready to learn Python. Jr version. Younger kids got to do robot cars.

Parkman Coders

Greenhouse project. Timelapse cameras and water moisture sensors.

Crowning achievements. When we started to this year. 6 kids have been doing this for as long as I’ve been doing it.

Issues

Getting materials, budget, crazy error messages in Idle. “Why is it red?! Why? Why? Why?”

Sliding scale of issues. Issues running our program. The tech world issue.

Diversity in programming. Women and minorities are not unicorns. We are magical.

Respect for a person as a human being. Engaging that person. “Hi, how are you?” Ask them a question. Answer their question.

Value.

  • Respect
  • Engage
  • Value

If you can do that, you can retain people. But that’s a middle issue, not the top issue.

Poverty. Kids that come to my program, they are poor. They know they are poor. If they go home and can’t afford to do it, it hurts them. If I give them a RaspberryPi and they can’t afford to connect a keyboard, or a network connection, it hurts them.

Illiteracy. “Ms. Q: I want to play robots.” He couldn’t click or find anything. He couldn’t read, and didn’t know how to spell his name.

Hunger. If you have been hangry at 2-3, you know what it’s like. “Ms. Q. Can I have some snacks to take home?” Sidenote: this story resonated with many of us at the conference who were hangry because the Cleveland Convention Center kept running out of food at meal times.

Crime/Violence. “Ms. Q: Have you ever been to jail? What’s it like?” His cousin was going to jail and he wanted to know about it. It was just a normal conversation for him.

Lack of Resources & Adversity. Resilience and Resourcefulness.

Innovation. Detroit is experiencing a revival. Around the library, kids were stealing lawnmowers, taking out the motors and putting them on bikes. Library allows you to use the computer for 1 hour. Kid figured out that you can apply for a library card on the Detroit website. So a kid applied for 20 library cards, so he could play robots with his friends for longer.

Exposure. Exposure to Python is exposure to the world. When you learn how to code, the first thing you do is to break someone else’s code. Because you’re empowered to use your superpowers for good or evil.

Consumer vs. Innovator. You have $1m, how are you going to spend it? Everyone was the same. House, mansion, car, house for mom, iPhone, a few pairs of shoes. Your aspiration should not be to own an iPhone. It should be to create the next iPhone.

Greatness. They all have greatness in them, just need to cultivate it. Don’t be selfish. No one has ever become less great by helping someone else become great.

Support public libraries and educators

parkmancoders.org

Intuitive Augmented Reality Visualization

Innumeracy. Humans are bad about judging numbers.

There are 3 de-humanizing factors. We spend so much mental effort comprehending the number that we don’t put it in context. E.g. 15% of US adults have no HS diploma. That’s not many! Oh, it’s 37 million people.

As of 2018, we have become factories for creating numbers. Every time we move, we create data, and we need to build data centers and computers to hold all of that data. But we have no idea how to interpret those numbers.

Augmented reality is a corrective tool for innumeracy. Not disruptive, spatial mapping.

Example 1 Big numbers, like really big

112 billion. I went to the Forbes 400 and found the top 4 highest earning individuals. Let’s create a data viz. Bar chart. It shows a relative volume. Doesn’t show the context. What does it represent? Bring to a scale that

To rebuild Manhattan would cost 630 billion. Now we can translate the wealth of individuals into how many blocks of Manhattan they could build.
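The translation itself is simple division. A minimal sketch, using only the two figures quoted in the talk (the function and constant names are made up for illustration):

```python
MANHATTAN_COST = 630e9  # cost to rebuild Manhattan, as quoted in the talk
FORTUNE = 112e9         # the fortune quoted in the talk


def share_of_manhattan(wealth: float, total: float = MANHATTAN_COST) -> float:
    """Return the fraction of Manhattan that a given fortune could rebuild."""
    return wealth / total


if __name__ == "__main__":
    print(f"{share_of_manhattan(FORTUNE):.1%} of Manhattan")  # 17.8% of Manhattan
```

"Roughly a sixth of Manhattan" is easier to picture than "112 billion", which is the whole argument for putting the number in spatial context.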

Tech stack

50% is fully in Python

  • CSV
  • pandas
  • Unity
  • Blender (for 3d viz)

Oops! I committed My Password to GitHub!

Miguel Grinberg

blog.miguelgrinberg.com

github.com/miguelgrinberg

Software Dev at Rackspace

Author of the Flask Web Development book

Did you ever commit a password to source control?

The vast majority of people said yes! You are laughing, but my research says that you are doing this too. So keep laughing.

  • “Yeah, but it was by accident”

How? I wanted to test something that was going to go to the cloud. The application used cloud APIs to grab passwords to use. I typed in my personal password. But the test didn’t work. I spent the next couple of hours working on it, and forgot to undo it.

  • “Yeah, but it’s fine because…”

Most common excuse, “… because it’s a private repo.” So because it’s private, only you and your team can access it? That logic does not work. A private repo is not encrypted. Lots of GitHub employees can see it. You can see it from the inside. It’s not just you and your team. If one member of your team needs to be let go, they know your password.

The other excuse, “… all of the passwords in the code are for internal services”. I.e. you may have the password, but there’s nothing you can do with it. If you protect the perimeter, you can be lousy with your security inside. It’s bad logic. It’s not a good idea.

How to fix the mess after you’ve done it

  • Do not do this: make a new commit with the password removed
    • git keeps a list of the changes that you made. It’s all there.
  • Do not simply rebase the commit.
    • When you modify a commit, it doesn’t go away immediately
    • The original commit stays there in an orphaned state until it gets garbage collected
    • If someone fetched it before you fixed it, then they have your password on their local machine
  • That password is now out there
  • REVOKE the password

How to prevent it next time

  • USE ENVIRONMENT VARIABLES
    • password = os.environ['PASSWORD']
    • secret_key = os.environ.get('SECRET_KEY')
    • database_url = os.environ.get('DATABASE_URL', 'sqlite:///')
  • Adding secrets to the environment
    • .profile
    • .bashrc
    • other config files
    • .env files for your project (add it to .gitignore)
    • Do not type passwords in your shell!
      • They are stored by the shell in the history
      • There will be a copy of your password in the shell history
    • from dotenv import load_dotenv
      • finds a .env file in the current directory
      • adds any variables to the current environment
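Putting the three access patterns above together, here is a minimal sketch of a settings loader. The load_settings function is hypothetical and uses only the standard library. With python-dotenv installed, calling load_dotenv() beforehand would first copy variables from a .env file into os.environ.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read configuration from an environment-like mapping (hypothetical helper)."""
    if env is None:
        env = os.environ
    return {
        # Required: a missing variable raises KeyError right at startup.
        "password": env["PASSWORD"],
        # Optional: None when the variable is not set.
        "secret_key": env.get("SECRET_KEY"),
        # Optional with a fallback default.
        "database_url": env.get("DATABASE_URL", "sqlite:///"),
    }


if __name__ == "__main__":
    # A plain dict stands in for the real environment in this demo.
    print(load_settings({"PASSWORD": "not-a-real-password"}))
```

Taking the mapping as a parameter keeps the function easy to test without touching the real environment.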

If the environment is not enough

  • The environment is not encrypted
    • fine if you’re working on a machine that only you own
  • Vault, Parameter Store (AWS), Secrets (K8s), Ansible Vault

Do’s and Don’ts

  • DO NOT write passwords into your code
  • Do import secrets from the environment
  • Do revoke any secrets that might have been compromised
  • DO NOT use services that don’t offer easy revocation
    • Must be immediate with a click of a button
  • DO NOT use the same password for more than one service
  • DO NOT use the same credentials for all users
    • If someone deletes a database by mistake, you can’t know who did it.
  • Use KeePass for desktop passwords

Reinventing the Parser Generator

David Beazley

Video is here.

Programming is magic

Different levels of abstraction

names, functions, data, objects, the great beyond…

Magic methods are magic

Python gives you great flexibility for modifying the environment

“dunder methods”
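For readers who have not met them, dunder ("double underscore") methods are the hooks Python calls for operators and built-ins. A small sketch with a made-up Money class (not from the talk):

```python
class Money:
    """Toy value type showing a few dunder methods (illustrative only)."""

    def __init__(self, cents: int) -> None:
        self.cents = cents

    def __add__(self, other: "Money") -> "Money":
        # Called for: money_a + money_b
        return Money(self.cents + other.cents)

    def __eq__(self, other: object) -> bool:
        # Called for: money_a == money_b
        return isinstance(other, Money) and self.cents == other.cents

    def __repr__(self) -> str:
        # Called by repr() and the interactive prompt
        return f"Money(cents={self.cents})"


if __name__ == "__main__":
    print(Money(150) + Money(250))  # Money(cents=400)
```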

Take it a step further and get into “Linguistic abstraction”. Write your own language. Make a language to match the problem to simplify the problem.

Let’s say you wrote your own programming language: how do you parse it? Non-trivial. (PLY).

  • Tokenize
  • Lexing
  • Parsing
  • Build an Abstract Syntax Tree
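The steps above can be sketched by hand for a toy arithmetic language. This is a hand-written recursive-descent parser, not PLY or SLY, and the grammar, token names, and tuple-based tree shape are all assumptions chosen for the example.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]

# One alternative per token kind; ERROR catches any other character.
TOKEN_RE = re.compile(
    r"(?P<NUMBER>\d+)|(?P<OP>[+\-*/()])|(?P<SKIP>\s+)|(?P<ERROR>.)"
)


def tokenize(text: str) -> List[Token]:
    """Lexing: turn raw text into (kind, value) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens


class Parser:
    """Parsing: build a nested-tuple AST from the token list.

    expr   := term (('+' | '-') term)*
    term   := factor (('*' | '/') factor)*
    factor := NUMBER | '(' expr ')'
    """

    def __init__(self, tokens: List[Token]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Token:
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", "")

    def advance(self) -> Token:
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek()[1] in ("+", "-"):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek()[1] in ("*", "/"):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.advance()
        if kind == "NUMBER":
            return int(value)
        if value == "(":
            node = self.expr()
            if self.advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")


def parse(text: str):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek()[0] != "EOF":
        raise SyntaxError("unexpected trailing input")
    return tree


if __name__ == "__main__":
    print(parse("1 + 2 * 3"))  # ('+', 1, ('*', 2, 3))
```

Even this tiny grammar takes a page of code, which is why most people reach for a parser generator once the language grows.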

Go read the “Dragon Book”. Dense, mathematical.

Most people turn to tools (lex and yacc in Unix). Tokenizers, parser-generators, code-generators.

Write out your grammar, run it through a parser generator, and you get a bunch of C code.

Python works this way too. Python’s parser is automatically generated from a grammar file.

~/cpython/Grammar

The grammar is automatically turned into C code. cpython/Parser/pgen

Feed it a grammar and a header file

“Compiling Little Languages in Python” 7th International Python Conference, 1998. Horrible abuse of docstrings. The parser generator was the Python code (all in one!)

I copied that for PLY: Python Lex-Yacc. A mashup of Unix yacc and Spark

http://dabeaz.com/ply

PLY Example

PLY Predates

  • Iterator protocol
  • New-style classes
  • Decorators
  • Generators

Other problems

  • Written in haste
  • Developed on a 200 MHz machine

1 patch for 15 year old critical bug

Various projects have tried to make PLY respectable. I don’t want it to be respectable, I want it to die!

Magic Python

  • Metaclasses still are magic
  • There are not types

SLY

See the video for the presentation. It was pretty impossible to take coherent notes here.

You’re an expert. Here’s how to teach like one

Shannon Turner

Video is here.

Founder, Hear Me Code, 3500 students, 200+ teachers and TAs

Good, well-organized talk with a solid message. Somewhat dry, unfortunately.

Know your audience

  • Manage a Junior dev
  • Manage mid level devs
  • Teach workshops
  • Want to give a talk at PyCon

  • Connect with them
    • Why should they listen to you?
    • What can you offer them?
    • I often take attendance
  • What motivates them?
    • You can tailor your lesson if you know that
    • Touch on the different motivations to all of them
  • Set expectations
    • What is the goal of the lesson?
    • What do you want students to take away?
  • Empathize with your audience
    • What is their level of understanding?
    • Don’t assume.
  • Don’t make anyone feel “less than”
    • Don’t make them feel like they are being talked down to
  • Remove distractions
    • Think about distractions in the broadest possible way
    • what might get in the way of learning
      • A terrrrrrrible lunch
      • Not being able to hear
      • Sexist jokes
      • Unexplained acronyms
  • Use live coding sparingly
    • Live coding can be difficult to follow and can derail your lesson plan
    • If there are no slides, there’s nothing to follow
    • Challenging to refer back to
  • Remember your journey
  • Smash jargon
  • Keep it conversational
    • Speak it as plainly as possible
    • “Hey Python, can you open this file?”

Examples that connect

  • Use examples that people already understand
    • We teach for loops when we take attendance
    • Day of the week/month of the year
  • Use specific examples
    • foo/bar/baz do not mean anything
  • Be practical
    • How will students be able to use this
    • Don’t just teach syntax, teach the context!
  • Why? So what?
    • When will I use this?
    • Clarify whether I should be teaching this (now, or at all)
  • One concept at a time
    • Really hard to learn more than one thing at once
    • This doesn’t just mean uncluttered slides
  • Model good behaviour
    • provide sample code
    • use meaningful variable names
  • Be flexible
    • Examples may connect with most people
    • But it’s rare that one size fits all
  • Mistakes happen
    • When you make a mistake, turn it into a teachable moment
    • Explain what happened and how you’re fixing it.
    • E.g. “NameError”.
  • Do it wrong on purpose
    • When students get errors, will they be prepared
    • E.g. single quote, double quote.
    • e.g. “My string which will not work’

Know what to hold back

  • Answer the question asked
    • Don’t add lots of extra information
    • This can be more confusing than enlightening
  • Nuance isn’t always helpful
    • Teach most common situations
    • They may never hit that painful corner case
  • Teach the pool, not the ocean
    • Know what is relevant in the moment and what is best saved for later
    • We learn to swim in the pool
  • Don’t cover too much material
    • People have limits
    • Is your lesson plan realistic
    • You will cover less ground than you think
    • Practice and cut

Fostering a love of learning

  • Reward curiosity
    • Children ask the best questions until it’s crushed out of them by tired adults
    • “I don’t know, let’s find out”
      • This is a magic phrase
      • It’s a humbling phrase
    • “Great question.”
      • “I love your instincts”
      • This means that the learner is extrapolating from what you’re talking about
      • Means you’re doing a good job!
  • Celebrate progress
    • Positive feedback encourages growth
    • Negativity stifles learning
  • Move about
    • Teaching concepts through movement makes for unforgettable experiences
    • Example: Everyone holding a letter of a string and people step forward when slice called out
    • People will never remember your slides
  • We learn by teaching
    • Teaching reinforces what we’ve learned
  • Your students learn by teaching, too
    • Teaching reinforces what we’ve learned
    • Create opportunities for students to teach others

shannonvturner.com/pycon

Dataclasses: The code generator to end all code generators

Raymond Hettinger

Video is here. Like most of Raymond’s talks, it has the feel of a revival.

“Mutable named tuple with defaults”

Introductions

What does a code generator do for you? You give it specs and it writes code for you. If your specs are good, the result is good. If not, not so much. You need to think about your code generator because it is working on your behalf according to its world view.

Out of the box, if you use it in the simplest possible way, it works. But it grows with you.

  • Why?
    • Saves you time
    • Reduces wordiness
    • People learning classes, feel like there is lots of boilerplate
  • What are they for
    • Depends on who you ask
    • Some think data classes are primarily about data. Like a struct, or a holder of data.
    • Others think that we spend a lot of time writing classes and want the boilerplate to go away. Lets you focus on business logic. It is a class generator.
  • What to think about?
    • How to use a code generator?
    • What code does it write for you?
    • Is that the code you actually wanted? (Sometimes yes, sometimes, no)
      • This is an opinionated data structure
    • Is the time investment worth the time that it saved you
      • I train lots of people to program in Python
      • Imagine a world where you have written Python, but have never written __init__(), etc.
    • What is the impact on debugging
      • Yeah… we spend more time reading and debugging than writing.
      • Though any tool that introduces an abstraction automatically makes debugging worse (if it’s a leaky abstraction)

History

  • dicts, tuple, hand-written classes
  • NamedTuples
  • ORMs
  • 3rd party traitlets
  • type annotations
  • 3rd party library, attrs was inspiration for dataclasses

Comparison with Named Tuples

Near-zero learning curve. Limited capabilities.

from dataclasses import dataclass

@dataclass
class Color:
    hue: int
    saturation: float
    lightness: float = 0.5

Uses a decorator instead of a subclass

__repr__ comes for free

Can access the fields by name

asdict()able

astuple()able

You can assign a value to it.

Compare to NamedTuple

Private methods (well, not really… just there to prevent namespace collisions)

NamedTuples are unpackable directly.

By default, dataclasses are unhashable, non-iterable and not orderable (equality comparison is generated, but ordering is not)
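The differences listed above are easy to check with the Color example from the talk next to an equivalent typing.NamedTuple. A small sketch (ColorTuple is a name introduced here for the comparison):

```python
from dataclasses import asdict, astuple, dataclass
from typing import NamedTuple


@dataclass
class Color:
    hue: int
    saturation: float
    lightness: float = 0.5


class ColorTuple(NamedTuple):
    hue: int
    saturation: float
    lightness: float = 0.5


if __name__ == "__main__":
    c = Color(33, 1.0)
    print(c)             # generated __repr__ shows the field names and values
    c.lightness = 0.75   # fields are mutable, unlike a named tuple
    print(asdict(c))     # {'hue': 33, 'saturation': 1.0, 'lightness': 0.75}
    print(astuple(c))    # (33, 1.0, 0.75)

    # A named tuple unpacks directly; a dataclass instance is not iterable.
    hue, saturation, lightness = ColorTuple(33, 1.0)

    for label, action in [
        ("hash", lambda: hash(c)),
        ("iter", lambda: iter(c)),
        ("order", lambda: c < Color(10, 0.5)),
    ]:
        try:
            action()
        except TypeError as exc:
            print(label, "->", type(exc).__name__)
```

Each of the three probes at the end raises TypeError with the default decorator options.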

Generated Code

Examples of the generated code being nicer than what people would have written otherwise. Whenever you say something about comparison, you need to say something about hashability

Supports introspection of the class, but “looks junky”.

Uncommon cases

Freezing and Ordering

  • dataclasses are not orderable

    @dataclass(order=True, frozen=True)

In this case, you save a lot more work. You get all of the comparison operators by default
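A short sketch of what those two flags buy you, reusing the Color fields from earlier. Ordering compares instances as if they were tuples of their fields, and freezing makes instances immutable and hashable.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(order=True, frozen=True)
class Color:
    hue: int
    saturation: float
    lightness: float = 0.5


if __name__ == "__main__":
    colors = [Color(33, 1.0), Color(10, 0.5), Color(10, 0.2)]
    print(sorted(colors))  # ordered by (hue, saturation, lightness)

    # Frozen instances with generated equality are hashable,
    # so they work in sets and as dict keys.
    print(len({Color(33, 1.0), Color(33, 1.0)}))  # 1

    try:
        colors[0].hue = 99
    except FrozenInstanceError:
        print("instances are read-only")
```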

Freezing and Ordering

Customized Field Specifications

Closing Thoughts

How to write deployment-friendly applications

Hynek Schlawack

Video is here.

Make life easier, by doing less.

Use standard building blocks so it runs exactly the same regardless of where it’s deployed

Simple Pyramid hello world app

Make a WSGI application. How you do that depends on your framework.

Initialization code is notoriously hard to test. And should be extremely robust.

WSGI app is usually just a callable.
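To make "just a callable" concrete, here is a framework-free stand-in for the Pyramid hello world from the talk, plus a tiny helper that calls it the way a server would. The call_app helper and the response text are assumptions for this sketch.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A complete WSGI app: takes the request environ, returns body chunks."""
    body = b"Hello, world!\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app without a server (hypothetical test helper)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(application))  # ('200 OK', b'Hello, world!\n')
```

If this lived in a module called hello.py (a made-up name), gunicorn could serve it with `gunicorn hello:application`.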

Use gunicorn, which allows you to call a function.

  • curl it
  • Get an apache log line

You could just use your single command line when you go to deploy, but then the dependency bleeds into your config.

Zoom out. You want a building block. Something that runs no matter what’s in it.

Shell script. Single point of truth. run-app.sh

exec 2>&1 gunicorn

Now your shell script is an adapter between application and environment. Works in docker, supervisor, systemd, heroku, cluster manager (K8), everything!

Now you have a black box, that is easy to run, which runs on localhost, logs on stdout. Don’t try to log to files anymore! For god’s sake, do not try to rotate them yourself. You know what’s going in and you know what’s coming out.

First problem: exposition - only localhost.

Configuration

  • Know the difference between the config of your application and the config of your general purpose software
  • What varies between deployments and environments
    • Almost nothing!
    • Logging does not!
      • Only two things, log level and the log format. You want human readable and machine parseable.
      • Take two configs, check them into your application and switch between them
  • What does vary?
    • External resources
    • Exposition, hostnames, ports
  • How?
    • ini file
    • Some options need to go to deployment and some need to go to application
    • ENVIRONMENT VARIABLES! (If only there were an easy way to pass key/value pairs between processes!)
  • What
    • direnv
    • envconsul
    • etcdenv
    • os.environ
    • envsubst
    • confd
    • consul-template
  • Moving from environment variables to files is not a problem
    • Easy to check in files into revision control and then just include the files that vary
  • env: HOST, PORT These are standard conventions. Use them
  • os.environ['LOG_LEVEL']
  • environ.config
    • you get lots of things: converters, nested prefixes, validation
  • New file that does the dirty work: wsgi.py
import environ
from .config import AppConfig
from .app_maker import make_app

app_cfg = environ.to_config(AppConfig)
application = make_app(app_cfg)
  • One thing I left out: certain things should not be in environment variables
  • Secrets, passwords, tokens
  • Environment variables can leak, and it can happen to you too
  • DON’T PUT SENSITIVE DATA INTO ENV VARIABLES
  • Solutions are platform specific so it gets hairy
  • Use the best thing for your platform
    • AWS Secrets Manager
    • HashiCorp Vault
    • Google Cloud KMS
  • Use the built-in template capabilities: secrets.py
    • takes more leaks to lose a file than an env var
  • Write a wrapper, when you want to hide implementation details
    • If you switch backends, rewrite it, and you can keep the same interface throughout your code
  • Impossible to reload config into the application
    • So?
    • Zero downtime deployments
    • Just redeploy the process every day (or at whatever time period)
  • Handle SIGTERM
    • standard signal sent by all process managers and
    • Many Python … just put their event handling loop in a try/except block and call it a day
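One way to handle SIGTERM explicitly, rather than relying on a try/except around the main loop, is to set a flag in the signal handler and let the loop finish its current unit of work. A minimal sketch under those assumptions (the class and function names are made up, and a real server would plug in its own work and cleanup functions):

```python
import signal


class ShutdownFlag:
    """Records that a termination signal arrived (hypothetical helper)."""

    def __init__(self) -> None:
        self.requested = False

    def handler(self, signum, frame) -> None:
        # Keep the handler tiny: just record the request.
        self.requested = True

    def install(self) -> None:
        # Must be called from the main thread of the main interpreter.
        signal.signal(signal.SIGTERM, self.handler)


def run_until_stopped(flag: ShutdownFlag, do_work, cleanup) -> None:
    """Run do_work() repeatedly until shutdown is requested, then clean up."""
    try:
        while not flag.requested:
            do_work()
    finally:
        # Close connections, flush buffers, and so on.
        cleanup()
```

A process would call flag.install() once at startup. After that, a SIGTERM from Docker, systemd, or a cluster manager ends the loop after the current iteration instead of killing the process mid-request.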

“Even if you deploy on a prod server with git pull like an animal…”

Introspection

  • You need an API endpoint where you can reach inside your application
  • Readiness
    • Your app is ready to serve
  • Expose an endpoint that checks all of its resources and return 200/500
  • Google /healthz
  • Mozilla /heartbeat
  • Salt /-/ready
  • GitLab /-/readiness
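A readiness endpoint of this kind can be sketched as a small WSGI app that runs a set of resource checks and answers 200 or 500. The make_readiness_app factory, the check names, and the /-/ready default path are assumptions for illustration. Real checks would ping the database, cache, and so on.

```python
from typing import Callable, Dict


def make_readiness_app(checks: Dict[str, Callable[[], bool]], path: str = "/-/ready"):
    """Build a WSGI app that reports whether all resource checks pass."""

    def application(environ, start_response):
        if environ.get("PATH_INFO") != path:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found\n"]

        failed = []
        for name, check in checks.items():
            try:
                ok = check()
            except Exception:
                # A check that blows up counts as a failed check.
                ok = False
            if not ok:
                failed.append(name)

        if failed:
            status = "500 Internal Server Error"
            body = ("not ready: " + ", ".join(sorted(failed)) + "\n").encode()
        else:
            status = "200 OK"
            body = b"ready\n"
        start_response(status, [("Content-Type", "text/plain")])
        return [body]

    return application
```

Load balancers and cluster managers only need the status code. The body is there for the human who is debugging a failed deployment.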

In HAProxy you can block this whole namespace

  • Liveness
    • Really cheap
    • Available for process and cluster managers
    • Can tell wh
    • /-/healthy
    • /-/liveness
    • __lbheartbeat__

Moar

  • Other ways to introspect
    • __version__
    • /-/metrics
    • /-/log-level
    • You can use this for everything that you used to use Unix signals for
    • Lots of freedom and power
  • But now our stuff is distributed
    • We care about local state
    • Learn to love postgres
    • Caching: redis/memcache
    • etcd/consul for config
  • Docker
    • Your black box just became blacker
    • Created an ecosystem
    • Applications are ephemeral
    • Your application knows how to start and communicate readiness
    • It knows how to serve and communicate health
    • It might know how to start
    • It’s automatically webscale ready

“Bees knees cluster orchestration du jour”

  • Loose coupling
  • Separate I/O & logic
  • Avoid global state
  • App boundary == just another boundary

This is an ideal. Not every application can fit this. Someone has to write to a disk at some point. Some of what I said conflicts with advice I gave last year. You have to break rules, you have to make tradeoffs, but you have to know the rules, and know the consequences.

Epilogue

Don’t you wish the companies would show you how they do it? One real open source example. PyPI Serves 1.5 PB of data. 6 billion requests / month.

ox.cx/df

vrmd.de

Secrets of a WSGI master

Graham Dumpleton, the author and maintainer of Apache mod_wsgi.

Video is here.

My laptop ran out of battery this day, so I don’t have many notes.

“Friends don’t let friends use raw WSGI.”

Apparently he has been making tons of improvements to mod_wsgi (including auto-reload development mode) without a lot of fanfare. He recommends that people check it out again.



Published

12 May 2018

Category

work
