API-ization of everything</a>. It seemed obvious then that software that talks to other software would be critical for building world-changing startups. What was less obvious then, and more obvious now, is that those APIs would need to be connected to harness the full potential of everyday apps.</p><p>One of the best companies at connecting APIs is Zapier, which went through the YC Summer 2012 batch. Zapier, the leader in easy automation, makes it simple to automate workflows and move data across 5,000+ apps. Setup takes less than six minutes and doesn’t require a single line of code. With over 5,000 of the most popular B2B and consumer apps integrated, they’re already powering over 10 million integration possibilities.</p><p>I got a chance to sit down with Bryan Helmig (<a href="https://twitter.com/bryanhelmig?lang=en">@bryanhelmig</a>), the CTO and founder of Zapier, to talk APIs and interoperability, and to learn more about the company’s first-ever public API: Natural Language Actions (NLA). With this new API, they’re making it possible to plug integrations directly into your product, and it’s optimized for LLMs.</p><hr><p><strong>Bryan, thanks for joining me and finding time to catch up. Given the launch of your new Natural Language Actions (NLA) API, I’m sure there was some insight or trend you were seeing that guided your build.</strong></p><p>Bryan: Absolutely. AI apps have become the fastest-growing category of apps on Zapier’s platform… ever. We’re seeing huge demand from our users and partner ecosystem to plug AI and large language models into their existing tools, workflows, and automations. And Zapier is well positioned to help – 81 billion workflow tasks have already been created on our platform.</p><p>We actually started by prototyping LLM products inside our own tech stack. We had two product experiments before NLA. The first was a fully chat-based Zap setup flow. With current-generation models, this often felt like playing “20 questions” with the model – not a great user experience. But it made us realize that other developers were likely facing the same challenges, and that Zapier could deliver a seamless and simple developer experience in a way that no other company could.</p><p>From there, we focused on how to wrap up and simplify each individual API endpoint you might find across Zapier’s 20k+ actions. We then allowed the model to call each one as a separate “tool”. That was the fundamental design principle we used internally, and it helped us expose this as the new NLA API – a way for any developer to add integrations into their products or internal tools in 5-10 minutes.</p><p><strong>For a team that’s the expert in APIs, launching Zapier’s first public API is a big deal. What about LLMs made this project different from how you’ve approached APIs in the past?</strong></p><p>Prior to LLMs, we never felt like we could deliver the magical developer experience that we wanted to. Under the hood, Zapier wraps up a ton of complexity from our ecosystem – our platform handles around 20 types of API auth, custom fields, versioning and migrations, arbitrary payload sizes, binary data. You name it. Making a Zapier API would have meant passing along all that complexity to our end users.</p><p>But now, AI and LLMs bring an interesting inflection point for Zapier: the new Natural Language Actions API abstracts all that complexity away from devs. In fact, the API has only one required parameter: “instructions”.</p>
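<p>To make that concrete, here is a minimal sketch of what such a call could look like. The endpoint path, header name, and action ID below are assumptions based on the description in this interview rather than Zapier’s published reference, so treat it as illustrative and check the NLA docs for the real details:</p>
<pre><code># Hypothetical sketch of executing an exposed NLA action with nothing
# but a natural language "instructions" field. The URL path and the
# x-api-key header are assumptions, not confirmed by Zapier's docs.
import requests

API_KEY = "your-nla-api-key"          # from nla.zapier.com/get-started
ACTION_ID = "your-exposed-action-id"  # an action the user has enabled

response = requests.post(
    f"https://nla.zapier.com/api/v1/exposed/{ACTION_ID}/execute/",
    headers={"x-api-key": API_KEY},
    json={"instructions": "Email jane@example.com the Q3 report summary"},
    timeout=30,
)
response.raise_for_status()
print(response.json())  # trimmed, LLM-friendly result of the underlying call
</code></pre>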
<p>NLA can also be used in the more “classic” way by calling it with hard-coded parameters instead of natural language parsing, but the natural language capabilities make it especially useful for people building products<em> on top</em> of LLMs. Ultimately we are using LLMs to make APIs easier to use for both humans and other LLMs!</p><p><strong>And what are some of the exciting things you’re seeing people build with your APIs?</strong></p><p><a href="https://zapier.com/blog/how-a-contractor-uses-ai-to-write-business-emails/">There’s this amazing story</a> about a contractor with dyslexia who teamed up with a client of his who happened to be familiar with Zapier. They built a Zap with OpenAI’s GPT-3 to write better business emails. It totally transformed his communication and even helped him land a massive $200,000 contract! It’s those stories of AI and automation coming together to help individual people that make me excited to be building on this technology today.</p><p>But, really, we’re just scratching the surface. We can’t predict what all the builders on our Zapier platform will create. I mean, when we launched multi-step Zaps 5 years ago, we set a “sanity” limit of 30 [workflow] steps. We thought that would clearly be enough for anybody. But in less than 24 hours, users were inundating us with requests to raise the limit. As we dug in deeper, we found these beautiful, mind-blowing, and complex Zaps – things we couldn’t have ever imagined. With LLMs in the mix, we’re hoping we’ll enable that same level of creativity and power, now from the developer community.</p><p><strong>So with all of the power that LLMs bring to the table, can you share what’s actually happening under the hood? How have you kept it simple?</strong></p><p>At its core, we leverage OpenAI’s GPT-3.5 series to understand and process natural language instructions from the user, map them to a specific API call, and return the response from the API – all in a way that’s optimized for LLMs.</p><p>First, users give explicit permission to the model to access certain actions. We try to make this super fast and simple, so it feels like an OAuth flow to the end user. When a user is setting this up, they’re able to see what the required fields are and either let the AI guess or manually specify the values. Then, once in a developer’s platform, the only required field for the user is the natural language instruction. We take that instruction from the user and let the model figure out how to fill in the required fields. The model then constructs an API call.</p><p>Before we can send the results back, we also need to make them LLM- and human-readable. Many APIs return really complex data in their responses that would not only cause an LLM to go over its token limit but also confuse both the model and the user. (As an example, a Gmail API call can return over 10,000 tokens!) We’ve done work on our end to trim down the results and expose just the relevant pieces. The NLA API currently guarantees arbitrary API payloads will fit into 350 tokens or fewer. This makes it incredibly easy to build on the NLA API without worrying about the shape or size of the data going in and out of the underlying APIs.</p>
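<p>Zapier hasn’t shared exactly how that trimming works, but the general idea of squeezing an arbitrary API payload into a fixed token budget is easy to sketch. The version below is hypothetical and simplified – it keeps a prioritized whitelist of fields and uses a crude characters-per-token estimate where a real implementation would use an actual tokenizer:</p>
<pre><code># Hypothetical sketch of trimming an API response to a token budget.
# The field priorities and 4-characters-per-token estimate are
# illustrative; only the 350-token budget comes from the interview.
import json

TOKEN_BUDGET = 350
CHARS_PER_TOKEN = 4  # crude stand-in for a real tokenizer like tiktoken

def estimate_tokens(obj) -> int:
    return len(json.dumps(obj)) // CHARS_PER_TOKEN

def trim_payload(payload: dict, priority_fields: list) -> dict:
    """Keep the highest-priority fields that still fit in the budget."""
    trimmed = {}
    for field in priority_fields:
        if field not in payload:
            continue
        candidate = {**trimmed, field: payload[field]}
        if estimate_tokens(candidate) <= TOKEN_BUDGET:
            trimmed = candidate
    return trimmed

# Example: a bulky fake email payload reduced to what a model needs.
raw = {
    "id": "abc123",
    "from": "jane@example.com",
    "snippet": "Meeting moved to 3pm",
    "payload": {"headers": ["X-Long-Header: ..."] * 500},  # the bloat
}
print(trim_payload(raw, ["from", "snippet", "id"]))
</code></pre>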
<p><strong>And for any aspiring API developer reading this – either looking to use your new APIs or even building their own – any tips from the guys who live and breathe APIs all day?</strong></p><p>Definitely. The big thing many APIs “get wrong” is being overly complex, overly unique, and overly hard to get started with. You’ve talked about how Stripe and Lob have gotten payments and shipping right by simplifying complexity; we leaned on similar examples for inspiration. If you’re building an API, you should too.</p><p>We’re definitely big fans of libraries like <a href="https://django-ninja.rest-framework.com/">django-ninja</a> or <a href="https://fastapi.tiangolo.com/">FastAPI</a> for creating compelling APIs with baked-in types and documentation. We’re using that sort of technology under the hood as well, both for design consistency and for scalability.</p>
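<p>As a generic illustration of that “baked-in types and documentation” point (this is not Zapier’s code), a typed FastAPI endpoint gets request validation and interactive docs at <code>/docs</code> essentially for free:</p>
<pre><code># Toy FastAPI service with a single typed endpoint. The one required
# "instructions" field mirrors the NLA design described above; the
# route and response shape are made up for illustration.
# Run with: uvicorn main:app --reload
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Toy natural-language action API")

class ExecuteRequest(BaseModel):
    instructions: str           # the only required field
    preview_only: bool = False  # hypothetical optional flag

@app.post("/actions/{action_id}/execute")
def execute_action(action_id: str, request: ExecuteRequest) -> dict:
    # A real service would resolve the action, have an LLM fill in its
    # fields from the instructions, call the underlying API, and trim
    # the result before returning it.
    return {"action_id": action_id, "status": "ok",
            "instructions": request.instructions}
</code></pre>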
<p>In the development of NLA, we’ve tried to be strict about not letting internal complexity filter down to end developers. NLA supports both OAuth and API keys for quickly getting started, and we have several off-the-shelf examples in the <a href="https://nla.zapier.com/api/v1/dynamic/docs/">API documentation</a>, including a published <a href="https://blog.langchain.dev/langchain-zapier-nla/">LangChain integration</a>.</p><p>If you want to get started, any developer can <a href="https://nla.zapier.com/get-started/">create an API key right away</a>. We’re excited to see what you can imagine – please share and tag me on Twitter (<a href="https://twitter.com/bryanhelmig?lang=en">@bryanhelmig</a>) to show me what you’ve got. Even better, I’d love feedback on what we’ve built, and we’re here to answer questions. And if you’re an API geek like the rest of us at Zapier… <a href="https://zapier.com/jobs/">we’re hiring</a>.</p><hr><h1>How to maintain engineering velocity as you scale</h1><p><em>By Marcelo Cortes, co-founder and CTO of Faire.</em></p><p>Engineering is typically the function that grows fastest at a scaling startup. It requires a lot of attention to make sure the pace of execution does not slow and cultural issues do not emerge as you scale.</p><p>We’ve learned a lot about pace of execution in the past five years at Faire. When we launched in 2017, we were a team of five engineers. From the beginning, we built a simple but solid foundation that allowed us to maintain both velocity and quality. 
When we found product-market fit later that year and started bringing on lots of new customers, instead of spending engineering resources on re-architecturing our platform to scale, we were able to double down on product engineering to accelerate the growth. In this post, we discuss the guiding principles that allowed us to maintain our engineering velocity as we scaled.</p><h2 id=\"four-guiding-principles-to-maintaining-velocity\">Four guiding principles to maintaining velocity</h2><p>Faire’s engineering team grew from five to over 100 engineers in three years. Throughout this growth, we were able to sustain our pace of engineering execution by adhering to four important elements:</p><ol><li><a href=https://www.ycombinator.com/"https://www.ycombinator.com/blog/how-to-maintain-engineering-velocity-as-you-scale/#1-hire-the-best-engineers\">Hiring the best engineers</a></li><li><a href=https://www.ycombinator.com/"https://www.ycombinator.com/blog/how-to-maintain-engineering-velocity-as-you-scale/#2-build-a-solid-long-term-foundation-from-day-one\">Building solid long-term foundations from day one</a></li><li><a href=https://www.ycombinator.com/"https://www.ycombinator.com/blog/how-to-maintain-engineering-velocity-as-you-scale/#3-track-engineering-metrics-to-drive-decision-making\">Tracking metrics to guide decision-making</a></li><li><a href=https://www.ycombinator.com/"https://www.ycombinator.com/blog/how-to-maintain-engineering-velocity-as-you-scale/#4-keep-teams-small-and-independent\">Keeping teams small and independent</a></li></ol><h2 id=\"1-hire-the-best-engineers\">1. Hire the best engineers</h2><p>You want to hire the best early team that you can, as they’re going to be the people helping you scale and maintain velocity. And good people follow good people, helping you grow your team down the road.</p><p>This sounds obvious, but it’s tempting to get people in seats fast because you have a truckload of priorities and you’re often the only one doing engineering recruiting in those early years. What makes this even harder is you often have to play the long game to get the best engineers signed on. Your job is to build a case for why your company is <em>the</em> opportunity for them. </p><p>We had a few amazing engineers in mind we wanted to hire early on. I spent over a year doing coffee meetings with some of them. I used these meetings to get advice, but more importantly I was always giving them updates on our progress, vision, fundraising, and product releases. That created FOMO which eventually got them so excited about what was happening at Faire that they signed up for the ride.</p><p>While recruiting, I looked for key competencies that I thought were vital for our engineering team to be successful as we scaled. These were:</p><h3 id=\"a-experts-at-our-core-technology\">a. Experts at our core technology</h3><p>In early stages, you need to move extremely fast and you cannot afford to make mistakes. We wanted the best engineers who had previously built the components we needed so they knew where mistakes could happen, what to avoid, what to focus on, and more. For example, we built a complex payments infrastructure in a couple of weeks. That included integrating with multiple payment processors in order to charge debit/credit cards, process partial refunds, async retries, voiding canceled transactions, and linking bank accounts for ACH payouts. 
We had built similar infrastructure for the Cash App at Square and that experience allowed us to move extremely quickly while avoiding pitfalls.</p><h3 id=\"b-focused-on-delivering-value-to-customers\">b. Focused on delivering value to customers</h3><p>Faire’s mission is to empower entrepreneurs to chase their dreams. When hiring engineers, we looked for people who were amazing technically but also understood our business, were customer focused, were passionate about entrepreneurship—and understood how they needed to work. That is, they understood how to use technology to add value to customers and product, quickly and with quality. To test for this, I would ask questions like: “Give me examples of how you or your team impacted the<em> </em>business.” Their answers would show how well they understood their current company’s business and how engineering can impact customers and change a company’s top-line numbers.</p><p>I also learned a lot when I let them ask questions about Faire. I love when engineering candidates ask questions about how our business works, how we make money, what our market size is, etc. If they don't ask these kinds of questions, I ask them things like: “Do you understand how Faire works?” “Why is Faire good for retailers?” “How would you sell Faire to a brand?” After asking questions like these a few times, you’ll see patterns and be able to quickly identify engineers who are business-minded and customer-focused.</p><p>Another benefit of hiring customer-focused engineers is that it’s much easier to shut down projects, start new ones, and move people around, because everyone is focused on delivering value for the customer and not wedded to the products they helped build. During COVID, our customers saw enormous change, with in-person trade shows getting canceled and lockdowns impacting in-person foot traffic. We had to adapt quickly, which required us to stop certain initiatives and move our product and engineering teams to launch new ones, such as our own version of <a href=https://www.ycombinator.com/"https://blog.faire.com/thestorefront/introducing-faire-summer-market-our-first-online-trade-show-event//">online trade shows</a>.</p><h3 id=\"c-grit\">c. Grit</h3><p>When we first started, we couldn’t afford to build the most beautiful piece of engineering work. We had to be fast and agile. This is critical when you are pre-product-market fit. Our CEO Max and a few early employees would go to trade shows to present our product to customers, understand their needs, and learn what resonated with them. Max would call us with new ideas several times a day. It was paramount that our engineers were <a href=https://www.ycombinator.com/"https://angeladuckworth.com/grit-book//">gritty and able to quickly make changes to the product. Over the three or four days of a trade show, our team deployed changes nonstop to the platform. We experimented with offerings like:</p><ul><li>Free shipping on first orders</li><li>Buy now, pay later</li><li>Buy from a brand and get $100 off when you re-order from the same brand</li><li>Free returns</li></ul><p>By trying different value propositions in a short time, our engineering team helped us figure out what was most valuable to our customers. 
That was how we found strong product-market fit within six months of starting the company.</p><figure class=\"kg-card kg-image-card\"><img src=https://www.ycombinator.com/"https://lh3.googleusercontent.com/CrRDf25EV8if-oP6rfEnSYeA_ttfKsayeQoM61gMOYFODZvpYsId0z2Y5RQ8z5xH4zt8UQaPBOwe1xus8oaqKQW1zxqNxz_ss9LHTpWyCc6tWsyJUm6_g6lVUtb6PkHluwNcqIU9MN3silgCLqtNHO2S8RkPcQCHBYiVPhK9Fteoiq_w9dZJqaxTqA/" class=\"kg-image\" alt loading=\"lazy\"></figure><p><em>Our trade show storefront back when we were called Indigo Fair.</em></p><h2 id=\"2-build-a-solid-long-term-foundation-from-day-one\">2. Build a solid long-term foundation from day one</h2><p>The number one impediment to engineering velocity at scale is a lack of solid, consistent foundation. A simple but solid foundation will allow your team to keep building on top of it instead of having to throw away or re-architecture your base when hypergrowth starts.</p><p>To create a solid long-term foundation, you first need to get clear on what practices you believe are important for your engineering team to scale. For example, I remember speaking with senior engineers at other startups who were surprised we were writing tests and doing code reviews and that we had a code style guide from the very early days. But we couldn’t have operated well without these processes. When we started to grow fast and add lots of engineers, we were able to keep over 95% of the team focused on building features and adding value to our customers, increasing our growth. </p><p>Once you know what long-term foundations you want to build, you need to write it down. We were intentional about this from day one and documented it in our <a href=https://www.ycombinator.com/"https://craft.faire.com/handbook-89f166841ec9/">engineering handbook</a>. Today, every engineer is onboarded using this handbook.</p><p>The four foundational elements we decided on were:</p><h3 id=\"a-being-data-driven\">a. Being data-driven</h3><p>The most important thing is to build your data muscle early. We started doing this at 10 customers. At the time, the data wasn’t particularly useful; the more important thing was to start to collect it. At some point, you’ll need data to drive product decision-making. The longer you wait, the harder it is to embed into your team.</p><p>Here’s what I recommend you start doing as early as possible:</p><ul><li>Set up data pipelines that feed into a data warehouse.</li><li>Start collecting data on how people are using your product. As you add features and iterate, record how those changes are impacting user interactions. All of this should go into a data warehouse that is updated within minutes and made available to your team. As your product gets increasingly complex, it will become more and more important to use data to validate your intuition.</li><li>We use Redshift to store data. As user events are happening, our relational database (MySQL) replicates them in Redshift. Within minutes, the data is available for queries and reports.</li><li>Train your team to use experimentation frameworks.</li><li>Make it part of the product development process. The goal is to transform your intuition into a statistically testable statement. A good place to start is to establish principles and high-level steps for your team to follow when they run experiments. We’ve set principles around when to run experiments vs. when not to, that running rigorous experiments should be the default (and when it isn’t), and when to stop an experiment earlier than expected. 
We also have teams log experiments in a Notion dashboard.</li><li>The initial focus should be on what impact you think a feature will have and how to measure that change. As you’re scoping a feature, ask questions like: How are we going to validate that this feature is achieving intended goals? What events/data do we need to collect to support that? What reports are we going to build? Over time, these core principles will expand.</li><li>The entire team should be thinking about this, not just the engineers or data team. We reinforced the importance of data fluency by pushing employees to learn SQL, so that they could run their own queries and experience the data firsthand.</li><li>It’ll take you multiple reps to get this right. We still miss steps and fail to collect the right data. The sooner you get your team doing this, the easier it will be to teach it to new people and become better at it as an organization.</li></ul><h3 id=\"b-our-choice-of-programming-language-and-database\">b. Our choice of programming language and database</h3><p>When choosing a language and database, pick something you know best that is also scalable long-term.<strong> </strong>If you choose a language you don’t know well because it seems easier or faster to get started, you won’t foresee pitfalls and you’ll have to learn as you go. This is expensive and time-consuming. We started with Java as our backend programming language and MySQL as our relational database. In the early days, we were building two to three features per week and it took us a couple of weeks to build the framework we needed around MySQL. This was a big tradeoff that paid dividends later on.</p><h3 id=\"c-writing-tests-from-day-one\">c. Writing tests from day one</h3><p>Many startups think they can move faster by not writing tests; it’s the opposite. Tests help you avoid bugs and prevent legacy code at scale. They aren’t just validating the code you are writing now. They should be used to enforce, validate, and document requirements. Good tests protect your code from future changes as your codebase grows and features are added or changed. They also catch problems early and help avoid production bugs, saving you time and money. Code without tests becomes legacy very fast. Within months after untested code is written, no one will remember the exact requirements, edge cases, constraints, etc. If you don’t have tests to enforce these things, new engineers will be afraid of changing the code in case they break something or change an expected behavior.<br><br>There are two reasons why tests break when a developer is making code changes:</p><ul><li>Requirements change. In this case, we expect tests to break and they should be updated to validate and enforce the new requirements.</li><li>Behavior changes unexpectedly. For example, a bug was introduced and the test alerted us early in the development process.</li></ul><p>Every language has tools to measure and keep track of test coverage. I highly recommend introducing them early to track how much of your code is protected by tests. You don’t need to have 100% code coverage, but you should make sure that critical paths, important logic, edge cases, etc. are well tested. <a href=https://www.ycombinator.com/"https://leanylabs.com/blog/good-unit-tests//">Here are tips for writing good tests</a>.</p><h3 id=\"d-doing-code-reviews\">d. Doing code reviews</h3><p>We started doing code reviews when we hired our first engineer. 
Having another engineer review your code changes helps ensure quality, prevents mistakes, and shares good patterns. In other words, it’s a great learning tool for new and experienced engineers. Through code reviews, you are teaching your engineers patterns: what to avoid, why to do something, the features of languages you should and shouldn’t use. </p><p>Along with this, you should have a coding style guide. Coding guides help enforce consistency and quality on your engineering team. It doesn’t have to be complex. We use a tool that formats our code so our style guide is automatically enforced before a change can be merged. This leads to higher code quality, especially when teams are collaborating and other people are reviewing code.</p><p>We switched from Java to Kotlin in 2019 and we have a comprehensive style guide that includes recommendations and rules for programming in Kotlin. For anything not explicitly specified in our guide, we ask that engineers follow <a href=https://www.ycombinator.com/"https://kotlinlang.org/docs/coding-conventions.html/">JetBrains’ coding conventions</a>.</p><p>These are the code review best practices we share internally:</p><ul><li>#bekind when doing a code review. Use positive phrasing where possible (\"there might be a better way\" instead of \"this is terrible\"; \"how about we name this X?\" instead of \"naming this Y is bad\"). It's easy to unintentionally come across as critical, especially if you have a remote team.</li><li>Don't block changes from being merged if the issues are minor (e.g., a request for variable name change, indentation fixes). Instead, make the ask verbally. Only block merging if the request contains potentially dangerous changes that could cause issues or if there is an easier/safer way to accomplish the same.</li><li>When doing a code review, ensure that the code adheres to your style guide. When giving feedback, refer to the relevant sections in the style guide.</li><li>If the code review is large, consider checking out the branch locally and inspecting the changes in IntelliJ (Git tab on the bottom). It’s easier to have all of the navigation tools at hand.</li></ul><h2 id=\"3-track-engineering-metrics-to-drive-decision-making\">3. Track engineering metrics to drive decision-making</h2><p>Tracking metrics is imperative to maintaining engineering velocity. Without clear metrics, Faire would be in the dark about how our team is performing and where we should focus our efforts. We would have to rely on intuition and assumptions to guide what we should be prioritizing. </p><p>Examples of metrics we started tracking early (at around 20 engineers) included:</p><ul><li><strong>Uptime.</strong> One of the first metrics we tracked was <a href=https://www.ycombinator.com/"https://docs.datadoghq.com/integrations/uptime//">uptime. We started measuring this because we were receiving anecdotal reports of site stability issues. Once we started tracking it, we confirmed the anecdotal evidence and dedicated a few engineers to resolve the issue.</li><li><strong>CI wait time.</strong> Another metric that was really important was CI wait time (i.e., time for the build system to build/test pull requests). 
We were receiving anecdotal reports of long CI wait times for developers, confirmed it with data, and fixed the issue.</li></ul><figure class=\"kg-card kg-image-card\"><img src=https://www.ycombinator.com/"https://lh3.googleusercontent.com/KiE8tjsqkFvtJFmyY_6-IinXuT1A6C4x6JBUSX9qb9nDHB9lurJZAlHocGDEi3Sx_HSHNuBxozMBljGOsNokrQIJ9Hk6ZolI39yQtKPz0yuAbue0G2weaKWXqD65_Gbal_LYuEC5TpPoGIdCGd0jflhy1yRQzuG-pxV1IePbh8LuEtvqehC1gHs5lw/" class=\"kg-image\" alt loading=\"lazy\"></figure><p><em>This is a dashboard we created in the early days of Faire to track important engineering metrics. It was updated manually by collecting data from different sources. Today, we have more comprehensive dashboards that are fully automated.</em></p><p>Once our engineering team grew to 100+, our top-level metrics became more difficult to take action against. When metrics trended beyond concerning thresholds, we didn’t have a clear way to address them. Each team was busy with their own product roadmap, and it didn’t seem worthwhile to spin up new teams to address temporary needs. Additionally, many of the problems were large in scale and would have required a dedicated group of engineers. </p><p>We found that the best solution was to build <a href=https://www.ycombinator.com/"https://www.datadoghq.com/blog/the-power-of-tagged-metrics//">dimensions so that we could view metrics by team. Once we had metrics cut by team, we could set top-down expectations and priorities. We were happy to see that individual teams did a great job of taking ownership of and improving their metrics and, consequently, the company’s top-level metrics.</p><h4 id=\"an-example-transaction-run-duration\">An example: transaction run duration</h4><p>Coming out of our virtual trade show, <a href=https://www.ycombinator.com/"https://blog.faire.com/thestudio/faire-summer-market-2021-our-global-trade-show-event-is-coming-in-july//">Faire Summer Market</a>, we knew we needed significant investment in our database utilization. During the event, site usage pushed our database capacity to its limits and we realized we wouldn’t be able to handle similar events in the future.</p><p>In response, we created a metric of how long transactions were open every time our application interacted with the database. Each transaction was attributed to a specific team. We then had a visualization of the hottest areas of our application along with the teams responsible for those areas. We asked each team to set a goal during our planning process to reduce their database usage by 20% over a three-month period. The aggregate results were staggering. Six months later, before our next event—<a href=https://www.ycombinator.com/"https://blog.faire.com/thestorefront/announcing-faires-2022-winter-virtual-trade-show-events//">Faire Winter Market</a>—incoming traffic was 1.6x higher, but we were nowhere close to maxing out our database capacity. 
Now, each team is responsible for monitoring their database utilization and ensuring it doesn’t trend in the wrong direction.</p><h3 id=\"managing-metrics-with-kpi-scorecards\">Managing metrics with KPI scorecards</h3><p>We’re moving towards a model where each team maintains a set of key performance indicators (KPIs) that get published as a scorecard reflecting how successful the team is at maintaining its product areas and the parts of the tech stack it owns.</p><p>We’re starting with a top-level scorecard for the whole engineering team that tracks our highest-level KPIs (e.g., <a href=https://www.ycombinator.com/"https://docs.datadoghq.com/tracing/guide/configure_an_apdex_for_your_traces_with_datadog_apm//">Apdex, database utilization, CI wait time, severe bug escapes, flaky tests). Each team maintains a scorecard with its assigned top-level KPIs as well as domain-specific KPIs. As teams grow and split into sub-teams, the scorecards follow the same path recursively. Engineering leaders managing multiple teams use these scorecards to gauge the relative success of their teams and to better understand where they should be focusing their own time.</p><p>Scorecard generation should be as automated and as simple as possible so that it becomes a regular practice. If your process requires a lot of manual effort, you’re likely going to have trouble committing to it on a regular cadence. Many of our metrics start in DataDog; we use their API to extract relevant metrics and push them into Redshift and then visualize them in Mode reports.</p><p>As we’ve rolled this process out, we’ve identified criteria for what makes a great engineering KPI:</p><ul><li><strong>Can be measured and has a believable source of truth.</strong> If capturing and viewing KPIs is not an easy and repeatable task, it’s bound to stop happening. Invest in the infrastructure to reliably capture KPIs in a format that can be easily queried.</li><li><strong>Clearly ladders up to a top-level business metric.</strong> If there isn’t a clear connection to a top-level business metric, you’ll have a hard time convincing stakeholders to take action based on the data. For example, we’ve started tracking pager volume for our critical services: High pager volume contributes to tired and distracted engineers which leads to less code output, which leads to fewer features delivered, which ultimately means less customer value.</li><li><strong>Is independent of other KPIs.</strong> When viewing and sharing KPIs, give appropriate relative weight to each one depending on your priorities. If you’re showing two highly correlated KPIs (e.g., cycle time and PR throughput), then you’re not leaving room for something that’s less correlated (e.g., uptime). You might want to capture some correlated KPIs so that you can quickly diagnose a worrying trend, but you should present non-duplicative KPIs when crafting the overall scorecard that you share with stakeholders.</li><li><strong>Is normalized in a meaningful way.</strong> Looking at absolute numbers can be misleading in a high-growth environment, which makes it hard to compare performance across teams. For example, we initially tracked growth of overall infrastructure cost. The numbers more than doubled every year, which was concerning. When we later normalized this KPI by the amount of revenue a product was producing, we observed the KPI was flat over time. 
Now we have a clear KPI of “amount spent on infrastructure to generate $1 in revenue.” This resulted in us being comfortable with our rate of spend, whereas previously we were considering staffing a team to address growing infrastructure costs.</li></ul><p>We plan to keep investing in this area as we grow. KPIs allow us to work and build with confidence, knowing that we’re focusing on the right problems to continue serving our customers.</p><h2 id=\"4-keep-teams-small-and-independent\">4. Keep teams small and independent</h2><p>When we were a company of 25 employees, we had a single engineering team. Eventually, we split into two teams in order to prioritize multiple areas simultaneously and ship faster. When you split into multiple teams, things can break because people lose context. To navigate this, we developed a pod structure to ensure that every team was able to operate independently but with all the context and resources they needed. </p><p>When you first create a pod structure, here are some rules of thumb:</p><ul><li><strong>Pods should operate like small startups.</strong> Give them a mission, goals, and the resources they need. It’s up to them to figure out the strategy to achieve those goals. Pods at Faire typically do an in-person offsite to brainstorm ideas and come up with a prioritized roadmap and expected business results, which they then present for feedback and approval.</li><li><strong><strong><strong>Each pod should have no more than 8 to 10 employees. </strong></strong></strong>For us, pods generally include 5 to 7 engineers (including an engineering manager), a product manager, a designer, and a data scientist.</li><li><strong>Each pod should have a clear leader. </strong>We have an engineering manager and a product manager co-lead each pod. We designed it this way to give engineering a voice and more ownership in the planning process.</li><li><strong>Expect people to be members of multiple pods. </strong>While this isn’t ideal, there isn’t any other way to do it early on. Resources are constrained, and you need a combination of seasoned employees and new hires on each pod (otherwise they’ll lack context). Pick one or two people who have lots of context to seed the pod, then add new members. When we first did this, pods shared backend engineers, designers, and data analysts, and had their own product manager and frontend engineer.</li><li><strong>If you only have one product, assign a pod to each well-defined part of the product.</strong> If there’s not an obvious way to split up your product surface area, try to break it out into large features and assign a pod to each.</li><li><strong><strong><strong>Keep reporting lines and performance management within functional teams. </strong></strong></strong>This makes it easier to maintain:</li></ul><p>\t\t(1) Standardized tooling/processes across the engineering team and balanced \t\tleadership between functions</p><p>\t\t(2) Standardized career frameworks and performance calibration. We give our \t\tmanagers guidance and tools to make sure this is happening. For example, I \t\thave a spreadsheet for every manager that I expect them to update on a \t \t\tmonthly basis with a scorecard and brief summary of their direct reports’ \t\t \t\tperformance.</p><h3 id=\"how-we-stay-on-top-of-resource-allocation-census-and-horsepower\">How we stay on top of resource allocation: Census and Horsepower</h3><p>Our engineering priorities change often. We need to be able to move engineers around and create, merge, split, or sunset pods. 
In order to keep track of who is on which team—taking into account where that person is located, their skill set, tenure at the company, and more—we built a tool called Census.</p><p>Census is a real-time visualization of our team’s structure. It automatically updates with data from our ATS and HR system. The visual aspect is crucial and makes it easier for leadership to make decisions around resource allocation and pod changes as priorities shift. Alongside Census, we also built an algorithm to evaluate the “horsepower” of a pod. If horsepower is showing up as yellow or red, that pod either needs more senior engineers, has a disproportionate number of new employees, or both.</p><figure class="kg-card kg-image-card"><img src="https://lh3.googleusercontent.com/pJk7SUqsmeQLU_dYU3BrN5wMnzyHwVySmydpuiNbHgDddt_FzwwQzCQ_pQH75FX-InduoRGg5hSVhcfXZxRC3FztBZ3aF_2JnwIFMBOhjSey2cgRQEqs38oORhgZgrtwrmgO7CM-WSU_34oeyp15hdzHOrH_FAXTlFlJOt-A87J4Brce_ri3MER8RA" class="kg-image" alt="" loading="lazy"></figure><p><em>Census.</em></p><figure class="kg-card kg-image-card"><img src="https://lh3.googleusercontent.com/N7btbx4GDkomhZp8wj0CMlTiGywqDffV6qCakK6aZEILScjRiIqjhwjV1q2AlT6bmrzU9vqo_pa1ggXn8j_C0CWsO4BEQdHoq5EcPfOhZwhe8tg1oMmmmDeYQXNrjF99WOdM5AKVTT5GAisZM_idtecOsjdXH_qQ2ezvEVRLltbkMfmk1j3qouwt7g" class="kg-image" alt="" loading="lazy"></figure><p><em>Pods are colored either green, yellow, or red depending on their horsepower.</em></p><p>One of the most common questions that founders have is how to balance speed with everything else: product quality, architecture debt, team culture. Too often, startups stall out and sacrifice their early momentum in order to correct technical debt. In building Faire, we set out to both establish a unified foundation <em>and</em> continue shipping fast. These four guiding principles are how we did it, and I hope they help others do the same.</p>
Building","slug":"company-building","description":null,"feature_image":null,"visibility":"public","og_image":null,"og_title":null,"og_description":null,"twitter_image":null,"twitter_title":null,"twitter_description":null,"meta_title":null,"meta_description":null,"codeinjection_head":null,"codeinjection_foot":null,"canonical_url":null,"accent_color":null,"url":"https://ghost.prod.ycinside.com/tag/company-building/"},{"id":"62d804e33644180001d72a1f","name":"#1543","slug":"hash-1543","description":null,"feature_image":null,"visibility":"internal","og_image":null,"og_title":null,"og_description":null,"twitter_image":null,"twitter_title":null,"twitter_description":null,"meta_title":null,"meta_description":null,"codeinjection_head":null,"codeinjection_foot":null,"canonical_url":null,"accent_color":null,"url":"https://ghost.prod.ycinside.com/404/"},{"id":"61fe29efc7139e0001a71155","name":"Growth","slug":"growth","description":null,"feature_image":null,"visibility":"public","og_image":null,"og_title":null,"og_description":null,"twitter_image":null,"twitter_title":null,"twitter_description":null,"meta_title":null,"meta_description":null,"codeinjection_head":null,"codeinjection_foot":null,"canonical_url":null,"accent_color":null,"url":"https://ghost.prod.ycinside.com/tag/growth/"}],"primary_author":{"id":"61fe29e3c7139e0001a710d4","name":"Marcelo Cortes","slug":"marcelo-cortes","profile_image":"https://ghost.prod.ycinside.com/content/images/2022/10/Instagram-Image-Template--Square---7-.jpg","cover_image":null,"bio":"Marcelo Cortes is a co-founder and the CTO of Faire, an online wholesale marketplace connecting mostly small brands to independent, local retailers.","website":null,"location":null,"facebook":null,"twitter":null,"meta_title":null,"meta_description":null,"url":"https://ghost.prod.ycinside.com/author/marcelo-cortes/"},"primary_tag":{"id":"61fe29efc7139e0001a7116d","name":"Essay","slug":"essay","description":null,"feature_image":null,"visibility":"public","og_image":null,"og_title":null,"og_description":null,"twitter_image":null,"twitter_title":null,"twitter_description":null,"meta_title":null,"meta_description":null,"codeinjection_head":null,"codeinjection_foot":null,"canonical_url":null,"accent_color":null,"url":"https://ghost.prod.ycinside.com/tag/essay/"},"url":"https://ghost.prod.ycinside.com/how-to-maintain-engineering-velocity-as-you-scale/","excerpt":"Faire’s engineering team grew from five to over 100 engineers in three years. Throughout this growth, we were able to sustain our pace of engineering execution by adhering to four guiding principles.","reading_time":16,"access":true,"og_image":null,"og_title":null,"og_description":null,"twitter_image":null,"twitter_title":null,"twitter_description":null,"meta_title":null,"meta_description":null,"email_subject":null,"frontmatter":null,"feature_image_alt":null,"feature_image_caption":null},{"id":"61fe29f1c7139e0001a71995","uuid":"ba2a3279-7514-4d7b-812e-a461a251f8fa","title":"How To Build an oEmbed Integration for Your Startup, and Why It’s Necessary","slug":"how-to-build-an-oembed-integration-for-your-startup-and-why-its-necessary","html":"<!--kg-card-begin: html--><p>Your startup isn’t going to have the same user-growth trajectory as Facebook.</p>\n<p>No one’s is. It doesn’t matter how good your idea or execution are, it’s just math.</p>\n<p>When Facebook launched, there were almost a billion people with access to a computer connected to the Internet.</p>\n<p>But there wasn’t anything connecting the people behind those computers. 
Nor was there much good content. The enormous network coupled with very low saturation made this one of the greatest arbitrage opportunities of all time.</p>\n<p>To quote <a href=https://www.ycombinator.com/"https://medium.com/matter/buzzfeeds-jonah-peretti-goes-long-e98cf13160e7/" title=\"BuzzFeed's Jonah Peretti Goes Long\">Jonah Peretti</a>:</p>\n<p><em>&#8220;There was no competition. There were things accidentally happening that sometimes would go viral, and then we were like one of maybe a few dozen people trying to actually make viral web culture, when nobody was doing it. So the networks were completely open in the sense that no one was even trying to make content that intentionally would go viral.</em></p>\n<p><em>There&#8217;s sometimes moments where networks are so amenable to spread. Duncan [Watts] uses this forest-fire analogy, which is if there&#8217;s a forest where the underbrush is wet, the trees are far apart, there&#8217;s not many dead trees, you could take a flamethrower to it and it won&#8217;t burn. If the forest is dry and it&#8217;s been hot and the trees are close together, you can just drop a match and the whole thing will burn. I think there was a period between 2001 and 2003 when the dry forest was ready to burn. If you made something that was pretty funny and you made something that had certain qualities that caused people to want to share and talk and discuss, then things would spread pretty far. Now you see people do a really cool project or a cool Tumblr and they don&#8217;t end up on the Today Show.&#8221;</em></p>\n<p>In this article we’ll first take a look at why having an embed has become increasingly important given the current state of the web. Then we’ll dive into how to actually implement this for your own startup.</p>\n<p>Specifically, we’ll take a deeper look at an open protocol called <a href=https://www.ycombinator.com/"https://github.com/iamcal/oembed/" title=\"oEmbed\">oEmbed</a>, and how to build an oEmbed integration that’s compatible with <a href=https://www.ycombinator.com/"http://embed.ly/" title=\"Embed.ly\">Embed.ly</a>. The idea is that this will make it super easy for anyone to directly embed your content within Reddit threads, Medium posts, Confluence pages, etc.</p>\n<p>As an added bonus, these embeds don’t have to be simple static pages, they can be fully interactive. To give an example, here is a quick video showing the embed we built for our own startup:</p>\n<p><iframe loading=\"lazy\" width=\"100%\" height=\"315\" src=https://www.ycombinator.com/"https://www.youtube.com/embed/jSZXYBDZt3g/" frameborder=\"0\" allowfullscreen></iframe></p>\n<p>Even though this looks complex, it’s actually quite simple.</p>\n<h2>Why build an embed?</h2>\n<ol>\n<li>\n<p><strong>Network saturation</strong> — As explained above, the best new products are no longer likely to go viral on their own. So how do you get users? In short, people won’t sign up for your website unless they see your content at least 100 times within the sites and apps they already visit. And of course just seeing your content isn’t enough; they need to get differentiated value each time, and connect the experience with your brand.</p>\n<p>For our startup the differentiated part is easy, because we’re (currently) the only platform for publishing email conversations. 
But if, for example, we had built a podcasting platform, it would require a ton of work to make sure that everyone clicking the play button was A) aware that the content was from this new platform rather than from Soundcloud, Spotify, etc., and B) had some basic understanding of the product and its benefits.</p>\n</li>\n<li>\n<p><strong>Single-player value</strong> — if your site is only useful once you have 100M users, you don’t have a viable business. Even if you’re building a social network, your product needs to provide value even if there is only one person using it.</p>\n<p>The canonical example here is Wikipedia, which got its start by importing the entire 1910 edition of the Encyclopaedia Britannica as well as the U.S. census data for every town and city. This caused Wikipedia articles to start ranking in Google results, where some percentage of people who landed on these pages began contributing their own content.</p>\n</li>\n<li>\n<p><strong>Unit economics</strong> — Although these days the phrase is mostly used when discussing on-demand startups, the fact is that unit economics are equally important for social sites and content platforms. At a fundamental level, for every minute users spend creating content on your site, there needs to be some sort of ROI in terms of pageviews, engagement, subscribes, or conversions. It doesn’t matter if we’re talking about publishing a blog post, a photo, a video, an email conversation, etc., at the end of the day the thing that matters is the ratio of time invested vs the outcomes you care about. Having an embed is a way of dramatically improving this ratio for your users.</p>\n</li>\n</ol>\n<p>Those who have been following the tech industry for a while probably know that building an embed has been pretty standard advice ever since YouTube attributed this feature to their viral growth after launching it in July 2005. So what exactly has changed? Back then embeds were primarily about driving site growth as a whole. Even though YouTube had hundreds of millions of pageviews, there were still only a few thousand videos uploaded per day, so pretty much every great video eventually got seen by everyone on the site.</p>\n<p>Whereas today users are primarily responsible for promoting their own content, so embedding isn’t just a feature you’ll need during the growth phase of your startup, but rather it’s a part of the core value proposition you’ll need to get your first 1,000 users.</p>\n<h2>The oEmbed protocol</h2>\n<p>The best way to make your content embeddable is to implement <a href=https://www.ycombinator.com/"http://oembed.com//">oEmbed, an open protocol for telling web platforms how to create an embeddable version of any piece of content.</p>\n<p>How does this work?</p>\n<p>The best way to explain is by example. Let’s say we have this email thread on our site about good hikes in southern Connecticut:</p>\n<p><a href=https://www.ycombinator.com/"https://www.fwdeveryone.com/t/e8RFukWTS5Wo54fBNbZ2yQ/good-hikes-southern-ct/" title=\"Good hikes for Southern CT\">https://www.fwdeveryone.com/t/e8RFukWTS5Wo54fBNbZ2yQ/good-hikes-southern-ct</a></p>\n<p>What we want is to take this thread and embed it within a Reddit post, like this:</p>\n<p><a href=https://www.ycombinator.com/"https://www.reddit.com/r/Connecticut/comments/65kbok/good_hikes_for_southern_ct//">https://www.reddit.com/r/Connecticut/comments/65kbok/good_hikes_for_southern_ct/

</a></p>
<p>

So how do we do that? To start with, we need to build a special version of this article that&#8217;s designed to live within an iFrame:</p>\n<p><a href=https://www.ycombinator.com/"https://oembed.fwdeveryone.com/?thread-id=e8RFukWTS5Wo54fBNbZ2yQ\%22>https://oembed.fwdeveryone.com/?thread-id=e8RFukWTS5Wo54fBNbZ2yQ

</a></p>
<p>

This iFrame is hosted on a separate subdomain of our site. All it contains is just enough html, css, and javascript to render an embedded thread on either desktop or mobile. As you can see from the link above, we’re passing in the thread id as a URL parameter. There is some javascript that takes this URL parameter and uses it to make a request to our API to get the text of the thread. Once that endpoint returns its data, the thread is rendered.</p>\n<p>There are two important considerations here:</p>\n<ol>\n<li>\n<p><strong>Speed</strong> — The process of rendering the content needs to be as fast as possible. Large media organizations aren’t going to use your embed if it slows down their page load time. In practice, this means the time it take your iFrame to fetch data and render content should be less than 300 milliseconds, preferably faster.</p>\n</li>\n<li>\n<p><strong>Responsiveness</strong> — Your content needs to look good across a wide range of page sizes. Even if you don’t care about your normal site supporting older iOS devices, the people running the sites you want your content embedded within might not want their pages looking broken to folks still using the iPhone 5.</p>\n</li>\n</ol>\n<p>So now we’re done right?</p>\n<p>Well, not quite. We need a way to tell sites like Reddit how to actually render our iFrame.</p>\n<p>How does this work?</p>\n<p>Let’s start with two basic vocabulary terms:</p>\n<p><strong>Provider</strong> — The party providing the content that they want embedded within sites like Reddit, Medium, etc.<br />\n<strong>Consumer</strong> — Sites like Reddit, Medium, Confluence, etc., which allow their users to render the content of third-parties within their platforms.</p>\n<p>In our case, we’re the provider.</p>\n<p>What we need to do next is build a GET endpoint on our website that accepts one or more URL query parameters, and uses these query parameters to return a JSON response containing the information needed to render that article. The query parameters this endpoint needs to accept are as follows:</p>\n<ul>\n<li><strong>url</strong> — The URL of the resource that a user on a consumer platform (e.g. a Redditor) wants to embed. If the URL isn’t a valid resource, then you’re required to return a 404 NOT FOUND error. If the URL is a valid resource, but it’s not publicly accessible or the person who wants to embed it doesn’t have permission, then you’re required to return a 401 UNAUTHORIZED error.</li>\n<li><strong>maxwidth</strong> — The consumer adds this query parameter to specify the maximum width (in pixels) they are willing to accept for your iFrame. If, for example, the your endpoint gets called with a maxwidth=280, but the minimum width you implement for your iFrame is 400px, then you are required to return a 501 NOT IMPLEMENTED error.</li>\n<li><strong>maxheight</strong> — Same deal as the above, but for height. E.g. if your endpoint gets called with maxheight=400, but your iFrame requires a height of at least 600px, then you need to return the 501 error.</li>\n<li><strong>format</strong> — This is optional, and can be either JSON or XML. If the consumer calls your endpoint with format=xml, and you don’t implement XML, then again just return a 501 NOT IMPLEMENTED error. 
 (In our case we only implement JSON.)</li>
</ul>
<p>To simplify things, let’s look at the case where this endpoint is called with only the query parameter ‘url’, where the value is the URL of one of our email threads.</p>
<p>In our case, this means hitting the following endpoint like so:</p>
<p><a href="https://api.fwdeveryone.com/oembed?url=https://www.fwdeveryone.com/t/e8RFukWTS5Wo54fBNbZ2yQ">https://api.fwdeveryone.com/oembed?url=https://www.fwdeveryone.com/t/e8RFukWTS5Wo54fBNbZ2yQ</a></p>
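<p>It’s easy to sanity-check an endpoint like this by hand. Here’s a rough sketch of how a consumer-style request might look using Python’s <code>requests</code> library (the parameter values are just examples, not anything a real consumer is guaranteed to send):</p>
<pre><code>import requests

params = {
    "url": "https://www.fwdeveryone.com/t/e8RFukWTS5Wo54fBNbZ2yQ",
    "maxwidth": 700,   # optional; 501 if it's below the minimum width you support
    "format": "json",  # optional; 501 if you don't implement the requested format
}
resp = requests.get("https://api.fwdeveryone.com/oembed", params=params)
print(resp.status_code)  # 200, 404, 401, or 501 per the rules above
print(resp.json())       # the JSON document shown below
</code></pre>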
Hitting that endpoint returns the following JSON response:</p>
<pre><code>{
  "version": "1.0",
  "type": "rich",

  "provider_name": "FWD:Everyone",
  "provider_url": "https://www.fwdeveryone.com",

  "author_name": "Alex Krupp",
  "author_url": "https://www.fwdeveryone.com/u/alex3917",

  "html": "&lt;iframe src=\"https://oembed.fwdeveryone.com?thread-id=e8RFukWTS5Wo54fBNbZ2yQ\" width=\"700\" height=\"825\" scrolling=\"yes\" frameborder=\"0\" allowfullscreen&gt;&lt;/iframe&gt;",
  "width": 700,
  "height": 825,

  "thumbnail_url": "https://ddc2txxlo9fx3.cloudfront.net/static/fwd_media_preview.png",
  "thumbnail_width": 280,
  "thumbnail_height": 175,

  "referrer": "",
  "cache_age": 3600
}
</code></pre>
<p>Let’s quickly walk through each response parameter and explain what it means.</p>
<ul>
<li><strong>version</strong> is the version of the oEmbed protocol we’re implementing. Basically you’re just required to add "version": "1.0" to the JSON response.</li>
<li><strong>type</strong> refers to the ‘type’ of embed we’re implementing. Each embed type has a few JSON response parameters you’re required to implement. For example, if the thing you want to make embeddable is a photo, then your JSON response is required to contain the URL of the photo, as well as its width and height. The options for embed type are:
<ul>
<li><strong>photo</strong> — For if the thing we want to make embeddable is a photo.</li>
<li><strong>video</strong> — For if the thing we want to make embeddable is a video.</li>
<li><strong>link</strong> — For if we just want this endpoint to return information about the URL, but we don’t actually have an embeddable iFrame. (This could be useful for things like providing extra information about a link when a user hovers over it, and/or showing a preview image for the URL.)</li>
<li><strong>rich</strong> — For when we have an embeddable iFrame, but the content within that iFrame isn’t a single photo or a single video. (Most of the time this is the option you want.)</li>
</ul>
</li>
<li><strong>provider_name</strong> — The name of your website.</li>
<li><strong>provider_url</strong> — The URL of your website.</li>
<li><strong>author_name</strong> — The name of the author of the specific piece of content we want to make embeddable. In our case, the name of the person who uploaded the email thread.</li>
<li><strong>author_url</strong> — A link to the profile page of the person (or organization) above.</li>
<li><strong>html</strong> — This response parameter is required for the ‘rich’ oEmbed type. It should be an iFrame HTML element with a src attribute pointing to the URL that renders the iFrame for the resource you want to embed, plus width and height attributes. In our case, we also add the attributes <em>scrolling</em>, <em>frameborder</em>, and <em>allowfullscreen</em> to specify how we want the embed to look visually. Some sites that consume oEmbed will respect these attributes; others will strip them out or change them to match the rest of the page stylistically. If your embed requires scrolling (like ours) but one of your consumers is stripping out that attribute, this usually just requires sending them an email explaining the situation.</li>
<li><strong>width</strong> — The width the iFrame should be.
 If the consumer called this endpoint with the maxwidth query parameter, then the width here needs to be no greater than the maxwidth.</li>
<li><strong>height</strong> — The height the iFrame should be. Same deal as above, but for maxheight.</li>
<li><strong>thumbnail_url</strong> — The URL of an image you want to use as a thumbnail for this embed. In our case, we just use the logo for our website. But for threads with image attachments, we could instead use one of those images as the thumbnail. Having a thumbnail image is optional, but Reddit won’t render your embed unless you include one.</li>
<li><strong>thumbnail_width</strong> — The width of the thumbnail image. (Consumers won’t necessarily render the thumbnail using the dimensions you specify.)</li>
<li><strong>thumbnail_height</strong> — The height of the thumbnail image. (Same deal as above.)</li>
<li><strong>referrer</strong> — The consumer of your content has the option to pass in a referrer string as a query parameter. If they do so, you should return this string in your JSON response.</li>
<li><strong>cache_age</strong> — How long the consumer should cache the response for this endpoint.</li>
</ul>
<p>In our case we implemented this with Django Rest Framework, so here’s more-or-less what this looks like:</p>
<pre><code>from django.core.exceptions import ObjectDoesNotExist
from rest_framework import status
from rest_framework.response import Response
from rest_framework.views import APIView
# (imports of our internal utils and thread_service helpers are omitted)


class OEmbed(APIView):
    def get(self, request):
        # The second param passed to the .get() method is the default value, which
        # is returned if the specified key (first param) isn't found in the dictionary.
        url = request.query_params.get('url', '')
        # Query parameters arrive as strings, so cast the dimensions to ints.
        max_width = int(request.query_params.get('maxwidth', 0))
        max_height = int(request.query_params.get('maxheight', 0))
        resp_format = request.query_params.get('format', '')
        referrer = request.query_params.get('referrer', '')

        if resp_format and resp_format != 'json':
            return Response(data={}, status=501)

        # Only reject the dimensions if the consumer actually supplied them.
        if max_width and max_width &lt; 280:
            return Response(data={}, status=501)

        if max_height and max_height &lt; 825:
            return Response(data={}, status=501)

        try:
            thread_id = utils.get_thread_id_from_url(url)
        except ObjectDoesNotExist:
            return Response(data={}, status=404)

        try:
            thread = thread_service.get_thread_from_thread_id(request, thread_id)
        except InvalidPermissionError:
            return Response(data={}, status=401)

        width = max_width if (max_width and max_width &lt;= 700) else 700
        height = 825

        resp = thread_service.build_oembed_response(thread, width, height, referrer)
        return Response(data=resp, status=status.HTTP_200_OK)
</code></pre>
<p>Note that for readability I’m omitting the query parameter sanitization, e.g. stripping any XSS from the referrer string.</p>
<h2>Embed.ly</h2>
<p>So are we done yet?</p>
<p>In theory, yes, but in practice, not quite. The deal is that most mainstream oEmbed consumers use something called Embed.ly (YC W10), which makes it easier for consumers to implement the spec.</p>
<p>The best way to explain the value that Embed.ly provides is to start by looking at what the embedding process looks like with the Embed.ly integration enabled:</p>
<ol>
<li>A user on an oEmbed consumer platform performs an action that would trigger the creation of an embed within that platform. For example, a Redditor submits the URL of an article on your site to Reddit, or a Medium author pastes the URL into a new blog post they’re writing.</li>
<li>The oEmbed consumer platform checks to see if this URL is in their cached list of Embed.ly providers.
 If so, the site makes a request to Embed.ly with the URL.</li>
<li>Embed.ly makes the request to the oEmbed endpoint on your website with the URL being requested, and gets the JSON response with all the information needed to render the embed.</li>
<li>Embed.ly returns this JSON response to the oEmbed consumer, and then the consumer site uses this response to create the iFrame.</li>
</ol>
<p>The main benefits that Embed.ly provides are:</p>
<ul>
<li>A universal API to interact with each site that implements oEmbed, instead of the consumer platform needing to create an integration for each provider.</li>
<li>Testing each provider’s endpoint before whitelisting their site to ensure they are implementing the spec correctly.</li>
<li>Caching the JSON responses from the providers.</li>
<li>Providing a standardized way for iFrames to resize their own height after being embedded, via the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Window/postMessage">window.postMessage API</a>. E.g. if your embed contains text, then you may want the height to increase as the width decreases.</li>
</ul>
<p>The full requirements for becoming an Embed.ly provider are listed here: <a href="http://embed.ly/providers/new">http://embed.ly/providers/new</a></p>
Once you’ve verified that your embed is working correctly and meets all of Embed.ly’s requirements, just submit it via the above link. If all goes well it should be approved within a few days.</p>
<p>A couple of miscellaneous tips:</p>
<ul>
<li>If your iFrame requires scrolling, you need to ask Embed.ly to enable this.</li>
<li>Make sure your content is production-ready: gzipped, minified, static assets served from a CDN, etc.</li>
<li>Ensure your iFrame has the correct <a href="https://en.wikipedia.org/wiki/Canonical_link_element">canonical URL</a>.</li>
</ul>
<p>So now are we done?</p>
<p>Maybe. But there are a few quirks specific to each oEmbed consumer platform that are important to know about.</p>
<h2>Getting whitelisted for Medium</h2>
<p>In order to integrate with Medium, getting your Embed.ly integration enabled is the first step, but then you need to get whitelisted by Medium. What they’re looking for is as follows:</p>
<ul>
<li>Has a Do Not Track policy that meets <a href="https://medium.com/policy/how-we-handle-do-not-track-requests-on-medium-f2b4b4fb7c5e">Medium’s standards</a>.</li>
<li>Has branding that makes it clear the embed is a third-party service and not affiliated with Medium.</li>
<li>Performance (doesn’t slow down the page and renders well on all platforms).</li>
<li>Security (has a <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP">CSP</a> on the API and the iFrame, and is served only via HTTPS).</li>
</ul>
<p>If integrating with Medium is your main reason for building the embed, you can try getting pre-approved by sending over some PDFs with design mockups. (But I don’t represent Medium and make no guarantees.)</p>
<h2>Requirements for Reddit</h2>
<p>Reddit doesn’t require any sort of whitelisting, so your content will be embeddable as soon as your Embed.ly integration goes live. There are a couple of things to note though:</p>
<ul>
<li>Your oEmbed JSON response must return a thumbnail image in order for the embed to render, even though this is optional in the spec.</li>
<li>Reddit does not implement native height resizing. So if your site is such that different pieces of content naturally have different heights, you either need to prerender your content to figure out the height, or else just choose a fixed height and make your content scrollable.</li>
<li>Embedding will only work on subreddits that have media previews enabled, unless the user selects the “auto-expand media previews” option on their settings page. (For moderators, each subreddit has a settings page with an “expand media previews on comments pages” option.)</li>
</ul>
<h2>Slack unfurling</h2>
<p>If your VCs told you that you need to integrate with Slack, then the thing you want to check out is their <a href="https://medium.com/slack-developer-blog/everything-you-ever-wanted-to-know-about-unfurling-but-were-afraid-to-ask-or-how-to-make-your-e64b4bb9254">Everything you ever wanted to know about unfurling</a> blog post.</p>
<p>Slack doesn’t use Embed.ly.
Instead you need to implement the <a href="http://oembed.com/#section4">Discovery section</a> of the oEmbed spec.</p>
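<p>Discovery just means adding a <code>&lt;link rel="alternate" type="application/json+oembed"&gt;</code> tag to the head of any page you want to be unfurlable, pointing at your oEmbed endpoint. As a rough illustration (not FWD:Everyone’s actual implementation), generating that tag for a thread page might look like this:</p>
<pre><code>from urllib.parse import quote

OEMBED_ENDPOINT = "https://api.fwdeveryone.com/oembed"  # the endpoint from this post

def oembed_discovery_tag(page_url):
    """Return the &lt;link&gt; tag that points consumers at our oEmbed endpoint.

    Illustrative sketch only: drop the returned string into the &lt;head&gt; of
    each page you want to be embeddable.
    """
    href = "%s?url=%s&amp;format=json" % (OEMBED_ENDPOINT, quote(page_url, safe=""))
    return ('&lt;link rel="alternate" type="application/json+oembed" '
            'href="%s" title="oEmbed" /&gt;' % href)

# e.g. oembed_discovery_tag("https://www.fwdeveryone.com/t/e8RFukWTS5Wo54fBNbZ2yQ")
</code></pre>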
<h1>How to Use Responsive Images</h1>
<p>In the world of responsive web design one core, yet complicated, spec can net you substantial reductions in page size across the device spectrum. In this post I’ll demystify the complexity in the responsive images spec so you can use these powerful HTML attributes on your site. In part 2 you will learn how to build your own responsive image workflow, with a <a href="https://github.com/webflow/responsive-images-demo">code demo</a> that distills our responsive image stack into a single file. Also, we’ll dive into how we automate responsive images at scale, processing millions of images at Webflow with AWS Lambda.</p>
<p>Let’s dive in!</p>
<h3>Responsive Images on Today’s Web</h3>
<p>The <code>&lt;img&gt;</code> element has been around for a long time. Give it a <code>src</code> attribute and you’re well on your way. The spec adds two new attributes which the browser uses to make an image responsive.</p>
<p>The new attributes are <code>sizes</code> and <code>srcset</code>. To put it simply: <code>sizes</code> tells the browser how big the <code>&lt;img&gt;</code> will render, and <code>srcset</code> gives the browser a list of image variants to choose from. The goal is to hint to the browser which variant in <code>srcset</code> to start downloading as soon as possible.</p>
<p>The browser takes the <code>srcset</code> and <code>sizes</code> attributes you provide, combines them with the window width and screen density it already knows about, and can start downloading the correct image variant right after the HTML is parsed — before anything is rendered; before CSS and JavaScript are even loaded. Modern browsers with pre-fetching enabled can start downloading the correct variant before you even navigate to the page. That’s a huge end-user performance increase!</p>
<p>To see this in action, check out <a href="https://webflow.com/feature/responsive-images">https://webflow.com/feature/responsive-images</a> and open the network inspector to see the browser loading the correct variants.</p>
<h1>Responsive Attributes</h1>
<h3>How to Use Srcset</h3>
<p><code>srcset</code> is just a list of image variants. You can specify a pixel density next to each variant in the list like this: <code>srcset="http://variant-1.jpg 2x, http://variant-2.jpg 1.5x"</code>. However this format only solves for hardware, serving better quality images on better quality displays, and does little for responsive design.</p>
<p>What you really want is to list variants by pixel width so that when your site is loaded on a mobile layout and rendered at 500px wide, or on a desktop layout at 750px wide, it’ll only download the variant it needs to render that layout. The width-based format looks like this: <code>srcset="http://variant-1.jpg 500w, http://variant-2.jpg 750w, http://variant-3.jpg 1000w, http://variant-4.jpg 1500w"</code>. The <code>w</code> here represents the pixel width of the actual image file that the corresponding URL points to.</p>
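<p>If you generate markup server-side, the width-based <code>srcset</code> value is easy to build from a list of variants. Here’s a minimal sketch (the variant URLs and widths are just the placeholder values from above):</p>
<pre><code>def build_srcset(variants):
    """Build a width-based srcset value from (url, pixel_width) pairs."""
    return ", ".join("%s %dw" % (url, width) for url, width in variants)

# Hypothetical variants produced by an image-resizing step:
variants = [
    ("http://variant-1.jpg", 500),
    ("http://variant-2.jpg", 750),
    ("http://variant-3.jpg", 1000),
    ("http://variant-4.jpg", 1500),
]

print(build_srcset(variants))
# http://variant-1.jpg 500w, http://variant-2.jpg 750w, http://variant-3.jpg 1000w, http://variant-4.jpg 1500w
</code></pre>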
<h1>Data Packages for Fast, Reproducible Python Analysis</h1>
<p>The tragedy of data science is that 79% of an analyst’s time goes to data preparation. Data preparation is not only tedious, it steals time from analysis.</p>
<p>A <em>data package</em> is an abstraction that encapsulates and automates data preparation. More specifically, a data package is a tree of serialized data wrapped in a Python module. Each data package has a unique handle, a revision history, and a web page. Packages are stored in a server-side registry that enforces access control.</p>
<p><strong>Example: Bike for Your Rights</strong><br />
Suppose you wish to analyze bicycle traffic on Seattle’s Fremont Bridge. You could locate the source data, download it, parse it, index the date column, etc. — <a href="https://www.youtube.com/watch?v=_ZEWDGpM-vM">as Jake Vanderplas demonstrates</a> — or you could install the data as a package in less than a minute:</p>
<pre><code>$ pip install quilt  # requires HDF5; details below
$ quilt install akarve/fremont_bike
</code></pre>
<p>Now we can load the data directly into Python:</p>
<pre><code>from quilt.data.akarve import fremont_bike
</code></pre>
<p>In contrast to files, data packages require very little data preparation. Package users can jump straight to the analysis.</p>
<p><strong>Less is More</strong><br />
The Jupyter notebooks shown in Fig. 1 perform the same analysis on the same data. The notebooks differ only in data injection. On the left we see a typical file-based workflow: download files, discover file formats, write scripts to parse, clean, and load the data, run the scripts, and finally begin analysis. On the right we see a package-based workflow: install the data, import the data, and begin the analysis. The key takeaway is that file-based workflows require substantial data preparation (red) prior to analysis (green). (The notebooks are on GitHub.)</p>
<h1>Data Packages in Detail</h1>
<p><strong>Get the Package Manager</strong><br />
To run the code samples in this article you’ll need HDF5 1.8 <sup id="footnoteid1"><a href="#footnote1">1</a></sup> (here’s <a href="https://github.com/quiltdata/quilt#installation">how to install HDF5</a>) and the Quilt package manager:</p>
<pre><code>$ pip install quilt
</code></pre>
<p><strong>Get a Data Package</strong><br />
Recall how we acquired the Fremont Bridge data:</p>
<pre><code>$ quilt install akarve/fremont_bike
</code></pre>
<p><code>quilt install</code> connects to a remote registry and materializes a package on the calling machine.
 <code>quilt install</code> is similar in spirit to <code>git clone</code> or <code>npm install</code>, but it <a href="https://blog.quiltdata.com/its-time-to-manage-data-like-source-code-3df04cd312b8">scales to big data, keeps your source code history clean, and handles serialization</a>.</p>
<p><strong>Work with Package Data</strong><br />
To simplify dependency injection, Quilt rolls data packages into a Python module so that you can import data like you import code:</p>
<pre><code># python
from quilt.data.akarve import fremont_bike
</code></pre>
<p>Importing large data packages is fast since disk I/O is deferred until the data are referenced in code. At the moment of reference, binary data are copied from disk into main memory. Since there’s no parsing overhead, deserialization is <a href="http://wesmckinney.com/blog/pandas-and-apache-arrow/">five to twenty times faster</a> than loading data from text files.</p>
<p>We can see that <code>fremont_bike</code> is a group containing two items:</p>
<pre><code># python
&gt;&gt;&gt; fremont_bike
&lt;GroupNode '/Users/akarve/quilt_packages/akarve/fremont_bike':''&gt;
README
counts
</code></pre>
<p>A group contains other groups and, at its leaves, contains data:</p>
<pre><code># python
&gt;&gt;&gt; fremont_bike.counts.data()
                     West Sidewalk  East Sidewalk
Date
2012-10-03 00:00:00              4              9
2012-10-03 01:00:00              4              6
2012-10-03 02:00:00              1              1
...
[39384 rows x 2 columns]
</code></pre>
<p><strong>Create a Package</strong><br />
Let’s start with some <a href="https://drive.google.com/open?id=0Bxpxy4wQ033GZ2VTcTBkTzNYcTg">source data</a>. How do we convert source files into a data package? We’ll need a configuration file, conventionally called <code>build.yml</code>. <code>build.yml</code> tells <code>quilt</code> how to structure a package. Fortunately, we don’t need to write <code>build.yml</code> by hand. <code>quilt generate</code> creates a build file that mirrors the contents of any directory:</p>
<pre><code>$ quilt generate src
</code></pre>
<p>Let’s open the file that we just generated, <code>src/build.yml</code>:</p>
<pre><code>contents:
  Fremont_Hourly_Bicycle_Counts_October_2012_to_present:
    file: Fremont_Hourly_Bicycle_Counts_October_2012_to_present.csv
  README:
    file: README.md
</code></pre>
<p><code>contents</code> dictates the structure of a package.</p>
<p>Let’s edit <code>build.yml</code> to shorten the Python name for our data. Oh, and let’s index on the “Date” column:</p>
<pre><code>contents:
  counts:
    file: Fremont_Hourly_Bicycle_Counts_October_2012_to_present.csv
    index_col: Date
    parse_dates: True
  README:
    file: README.md
</code></pre>
<p><code>counts</code> — or any name that we write in its place — is the name that package users will type to access the data extracted from the CSV file. Behind the scenes, <code>index_col</code> and <code>parse_dates</code> are passed to <code>pandas.read_csv</code> as keyword arguments.</p>
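<p>In other words, the build step is doing roughly the following on your behalf (a hedged sketch of the equivalent pandas call, not Quilt’s actual internals):</p>
<pre><code># Roughly what the build step does with the options above (illustrative only):
import pandas as pd

counts = pd.read_csv(
    "src/Fremont_Hourly_Bicycle_Counts_October_2012_to_present.csv",
    index_col="Date",    # from build.yml
    parse_dates=True,    # from build.yml
)
# Quilt then serializes the resulting DataFrame to a binary format,
# which is why later imports skip parsing entirely.
</code></pre>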
<p>Now we can build our package:</p>
<pre><code>$ quilt build YOUR_NAME/fremont_bike src/build.yml
...
src/Fremont_Hourly_Bicycle_Counts_October_2012_to_present.csv...
100%|███████████████████████████| 1.13M/1.13M [00:09&lt;00:00, 125KB/s]
Saving as binary dataframe...
Built YOUR_NAME/fremont_bike successfully.
</code></pre>
<p>You’ll notice that <code>quilt build</code> takes a few seconds to construct the date index.</p>
<p><strong>The build process has two key advantages: 1) parsing and serialization are automated; 2) packages are built <em>once</em> for the benefit of all users — there’s no repetitive data prep.</strong></p>
<p><strong>Push to the Registry</strong><br />
We’re ready to push our package to the registry, where it’s stored for anyone who needs it:</p>
<pre><code>quilt login  # accounts are free; only registered users can push
quilt push YOUR_NAME/fremont_bike
</code></pre>
<p>The package now resides in the registry and has a landing page populated by <code>src/README.md</code>. Landing pages look <a href="https://quiltdata.com/package/akarve/fremont_bike/">like this</a>.</p>
<p>Packages are private by default, so you’ll see a 404 until and unless you log in to the <a href="https://quiltdata.com/">registry</a>. To publish a package, use <code>access add</code>:</p>
<pre><code>quilt access add YOUR_NAME/fremont_bike public
</code></pre>
<p>To share a package with a specific user, replace <code>public</code> with their Quilt username.</p>
<h1>Reproducibility</h1>
<p>Package handles, such as <code>akarve/fremont_bike</code>, provide a common frame of reference that can be reproduced by any user on any machine. But what happens if the data changes? <code>quilt log</code> tracks changes over time:</p>
<pre><code># run in the same directory where you ran quilt install akarve/fremont_bike
$ quilt log akarve/fremont_bike
Hash                          Pushed               Author
495992b6b9109a1f9d5e209d6...  2017-04-14 14:33:40  akarve
24bb9d6e9d80000d9bc5fdc1e...  2017-03-29 20:42:43  akarve
03d2450e755cf45fbbf9c3635...  2017-03-29 17:40:47  akarve
</code></pre>
<p><code>quilt install -x</code> allows us to install historical snapshots:</p>
<pre><code>quilt install akarve/fremont_bike -x 24bb9d6e9d80000d9bc5fdc1e89a0a77c40da33da5a054b05cdec29755ac408b
</code></pre>
<p><strong>The upshot for reproducibility is that we no longer run models on “some data,” but on specific hash versions of specific packages.</strong></p>
<h1>Conclusion</h1>
<p>Data packages make for fast, reproducible analysis by simplifying data prep, eliminating parsing, and versioning data. <strong>In round numbers, data packages speed both I/O and data preparation by a factor of 10.</strong></p>
<p>In future articles we’ll virtualize data packages across Python, Spark, and R.</p>
<p>To learn more visit <a href="http://QuiltData.com/">QuiltData.com</a>.</p>
<h1>Open Source</h1>
<p>The Quilt client is open source. Visit our <a href="https://github.com/quiltdata/quilt">GitHub repository</a> to contribute.</p>
<h1>Appendix: Command summary</h1>
<h1>How GetAccept Achieved 100% Uptime</h1>
<p>GetAccept (YC W16) is an e-signature tool with document tracking and smart sales automation. Before launching our platform at the end of 2015 we spent months researching and working to configure the perfect hosting environment for our SaaS application, and would like to share it with you.</p>
<p>Many of you have probably experienced long application latencies, hassles pushing out new releases and sleepless nights restoring databases. We definitely have. So, we decided to put an end to this and design our SaaS application from the ground up with this in mind. We aimed for 100% application uptime and full redundancy on all layers. 18 months into the business we have kept to our goal – not a single second of downtime and no service windows.</p>
<p>How did we achieve this?</p>
<h3>No-ssh environment with full redundancy</h3>
<p>We are a lean team and decided the whole environment needed to be easy to manage and upgrade/downgrade by anyone on the team. A no-ssh philosophy was implemented early – that means we should never have to log in to a server or write any terminal commands. From a scalability and redundancy perspective, this turned out to be great since we’ve been forced to rethink the way we design the application and configure the environment. This also ensures better security and minimizes access to sensitive data. To keep an ssh-free environment we use only standard PHP components and default EC2 configuration, and set up local development environments with the same versions as the production environment.</p>
<p>Using Amazon OpsWorks we created a PHP layer, load balancer, and applications for the app and our API. We configured each stack using Chef, OS packages and <a href="http://docs.aws.amazon.com/opsworks/latest/userguide/workingcookbook-json-override.html">custom JSON configuration</a>. Each region is set up as a Stack to give us better control and monitoring. This also means we can scale each region up or down with a single click in OpsWorks, and it also automatically handles code deployments and load balancing/redundancy. We use SQS for worker queues and the Elastic Beanstalk worker tier for autoscaled backend workers, which has never failed for us. An easy way to configure Elastic Beanstalk instances is to use the .ebextensions folder for OS packages, yum and other commands. In the root folder you can also use cron.yaml to have workers run scheduled tasks instead of configuring this manually on the servers.</p>
<p>We put the pieces together using Amazon Route 53 for our DNS hosting. We have set it up to route a user of the application to the closest application stack using the GeoLocation routing policy. After that, each region has its own load balancer to split traffic between instances and also handle redundancy in case a server goes down.</p>
<p>For our database layer we were early testers of the new Aurora DB but had to turn to RDS for MySQL since Aurora didn’t yet support cross-region replication. To maximize speed and redundancy we replicate the database to each region. Each replica uses load-balanced SSH tunnels to communicate with the master DB.
 You probably get it by now, but this means that a whole datacenter or even a region can go down and our application automatically redirects a user to the second-closest region host, where they can continue to work without disturbance.</p>
<h3>Low latency for documents and video</h3>
<p>Let’s take a step back first to understand why we would like the application hosted in multiple regions. In our business we work with displaying and signing documents combined with media such as video. We also have customers with a concern about hosting document data outside their own region (think US documents hosted in Europe or vice versa).</p>
<p>The obvious solution here would be to use a CDN network that distributes files to local nodes upon request. In our case, after some testing we could see that the initial load time was drastically higher than hosting from S3 in the same region. Our solution was to have each customer select the preferred document hosting region (S3 region) when creating the account, with the lowest-latency region selected as the default.</p>
<h3>No-touch deployment</h3>
<p>We come from a world where it was accepted to spend hours building out, uploading and backing up old code just to release a small patch, and where a failing release could occupy developers for up to a whole day trying to revert. Those days are gone, and we have created an environment where each developer can release new updates and patches without any of these steps.</p>
<p>Using Git for our code repo, we use a commit hook to parse the commit message and take actions based on it. Using the AWS SDK, we use the built-in Git support in OpsWorks to deploy the new version, upload static files to S3 and test the new release. For example, if a developer just commits a change it automatically deploys to our dev environment for manual testing. If the message also ends with a release command, the git server automatically deploys a new version of the application and workers to OpsWorks and Elastic Beanstalk. The updated version is deployed and live on all servers around the world within 2 minutes with zero downtime, thanks to rolling deployment. Yes, by doing this we give a lot of power to the developers, but at the same time we minimize all the steps to deploy a release, which has resulted in very few hiccups. As an added bonus, customers get pretty impressed when we can fix and deploy a bug patch within a few minutes at best, sometimes while having the customer on the phone.</p>
<h3>TL;DR</h3>
<p>Here is our recipe to maximize uptime and redundancy with low latency:</p>
<ul>
<li>Amazon OpsWorks with a stack for each region containing a PHP layer and ELB, with the code base in a private Git repository</li>
<li>RDS MySQL with region replicas over redundant SSH tunnels</li>
<li>Elastic Beanstalk workers connected to an SQS queue for background jobs</li>
<li>An S3 bucket in each region for secure and fast storage of documents and videos</li>
<li>CloudFront for static content such as images, jQuery, AngularJS</li>
<li>Route 53 DNS with GeoLocation routing policy and health checks</li>
</ul>
<p>Some resources to get you started:<br />
<a href="https://aws.amazon.com/documentation/opsworks/">https://aws.amazon.com/documentation/opsworks/</a><br />
<a href="http://www.augustcouncil.com/~tgibson/tutorial/tunneling_tutorial.html">http://www.augustcouncil.com/~tgibson/tutorial/tunneling_tutorial.html</a><br />
<a href="https://github.com/markomarkovic/simple-php-git-deploy">https://github.com/markomarkovic/simple-php-git-deploy</a><br />
<a href="https://cloudnative.io/blog/2015/03/aws-route-53-best-practices/">https://cloudnative.io/blog/2015/03/aws-route-53-best-practices/</a></p>
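<p>To make the no-touch deployment flow described above concrete: GetAccept drives deployments from a commit hook via the AWS SDK and OpsWorks’ built-in Git support. Purely as an illustration (not their actual code), a hook along those lines might look like this in Python with boto3, where the stack/app IDs and the “[release]” marker are made-up placeholders:</p>
<pre><code># post-receive hook sketch (illustrative only; IDs and the "[release]" marker are assumptions)
import subprocess
import boto3

OPSWORKS_STACK_ID = "YOUR_STACK_ID"
OPSWORKS_APP_ID = "YOUR_APP_ID"

def latest_commit_message():
    return subprocess.check_output(
        ["git", "log", "-1", "--pretty=%B"]).decode().strip()

def main():
    message = latest_commit_message()
    if not message.endswith("[release]"):
        print("No release marker; skipping production deploy.")
        return
    # OpsWorks pulls the app straight from the Git repo configured on the stack,
    # so a deployment is a single API call per regional stack.
    opsworks = boto3.client("opsworks", region_name="us-east-1")
    deployment = opsworks.create_deployment(
        StackId=OPSWORKS_STACK_ID,
        AppId=OPSWORKS_APP_ID,
        Command={"Name": "deploy"},
        Comment=message,
    )
    print("Started deployment", deployment["DeploymentId"])

if __name__ == "__main__":
    main()
</code></pre>
<p>A real hook would fan this out across every regional stack and the Elastic Beanstalk worker environments, as described above.</p>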
<p>Good luck with your setups and feel free to reach out to us at <a href="https://www.getaccept.com/">GetAccept</a> if you want any help. We’d also like to discuss and debate the hosting environment above.</p>
<h1>From edge2cat to edge2anything with TensorFlow</h1>
<p>Unless you have been hiding under a rock for the past few months, you have likely seen Christopher Hesse’s <a href="https://affinelayer.com/pixsrv/">demo of image-to-image translation</a> (a <a href="https://www.tensorflow.org/">Tensorflow</a> port of <a href="https://github.com/phillipi/pix2pix">pix2pix by Isola et al.</a>). In case you missed it, search for <a href="https://www.google.com/#q=edge2cat">edge2cat</a>, and a whole new world of cat-infused artificial intelligence will be opened to you. The model is trained on cat images, and it can translate hand drawn cats to realistic images of cats! Here are a few of our personal favorite “edge” image to cat translations generated by Chris’s model, ranging from accurate to horrifying:</p>
<p>Pachyderm has created a totally reusable and generic pipeline that takes care of all the training, pre-processing, etc. for you, so you can jump right into the fun parts! They utilize <a href="https://medium.com/pachyderm-data/sustainable-machine-learning-workflows-8c617dd5506d#.mmwccp55c">this machine learning pipeline template</a> (produced by the team at Pachyderm in collaboration with Chris) to show how easy it can be to deploy and manage image generation models (like those pictured above). Everything you need to run the reusable pipeline can be found <a href="https://github.com/pachyderm/pachyderm/tree/master/doc/examples/ml/tensorflow">here on Github</a>, and is described below.</p>
<h1>The Model</h1>
<p>Christopher Hesse’s image-to-image demos use a Tensorflow implementation of the Generative Adversarial Networks (or GANs) model presented in <a href="https://arxiv.org/pdf/1611.07004v1.pdf">this article</a>. Chris’s full Tensorflow implementation of this model can be found <a href="https://github.com/affinelayer/pix2pix-tensorflow">on Github</a> and includes documentation about how to perform training, testing, pre-processing of images, exporting of the models for serving, and more.</p>
<p>In this post we will utilize Chris’s code in that repo along with a <a href="https://github.com/dwhitena/pach-pix2pix/blob/master/Dockerfile">Docker image</a> based on <a href="https://hub.docker.com/r/affinelayer/pix2pix-tensorflow/">an image he created</a> to run the scripts (which you can also utilize in your experiments).</p>
<h1>The Pipeline</h1>
<p>To deploy and manage the model, we will execute its training, model export, pre-processing, and image generation in the reusable <a href="http://pachyderm.io/pps.html">Pachyderm pipeline</a> mentioned above. This will allow us to:</p>
<ol>
<li>Keep a rigorous historical record of exactly what models were used on what data to produce which results.</li>
<li>Automatically update online ML models when training data or parameterization changes.</li>
<li>Easily revert to other versions of an ML model when a new model is not performing or when “bad data” is introduced into a training data set.</li>
</ol>
<p>The general structure of our pipeline looks like this:</p>
<p>To run the pipeline yourself, you can start with a local installation of Pachyderm. Alternatively, you can quickly spin up a real Pachyderm cluster in any one of the popular cloud providers. Check out the <a href="http://docs.pachyderm.io/">Pachyderm docs</a> for more details on deployment.</p>
<p>Once deployed, you will be able to use Pachyderm’s <code>pachctl</code> CLI tool to create data repositories and start our deep learning pipeline.</p>
<h1>Preparing the Training and Model Export Stages</h1>
<p>First, let’s prepare our training and model export stages. Chris Hesse’s <code>pix2pix.py</code> script includes:</p>
<ul>
<li>A “train” mode that we will use to train our model on a set of paired images (such as facades paired with labels or edges paired with cats). This training will output a “checkpoint” representing a persisted state of the trained model.
</li>
<li>An “export” mode that will then allow us to create an exported version of the checkpointed model to use in our image generation.</li>
</ul>
<p>Thus, our “Model training and export” stage can be split into a training stage (called “checkpoint”) producing a model checkpoint and an export stage (called “model”) producing a persisted model used for image generation.</p>
<p>To prepare input images for the model, we use a <code>process.py</code> script to perform the resizing.</p>
<p>To actually perform our image-to-image translation, we need to use a <a href="https://github.com/affinelayer/pix2pix-tensorflow/blob/master/server/tools/process-local.py">process_local.py script</a>. This script will take our pre-processed images and persisted model as input and output the generated, translated result.</p>
<p>We then create another JSON specification, <code>pre-processing_and_generation.json</code>, telling Pachyderm to: (i) run the <code>process.py</code> script on the data in the “input_images” repository, outputting to the “preprocess_images” repository, and (ii) run <code>process_local.py</code> with the model in the “model” repository and the images in the “preprocess_images” repository as input. This can be done by running <code>pachctl create-pipeline -f pre-processing_and_generation.json</code>.</p>
<h1>Putting it All Together, Generating Images</h1>
<p>Now that we have created our input data repositories (“input_images” and “training”) and we have told Pachyderm about all of our processing stages, our production-ready deep learning pipeline will run automatically when we put data into “training” and “input_images.” It just works.</p>
<p>Chris provides a nice guide for preparing training sets <a href="https://github.com/affinelayer/pix2pix-tensorflow#datasets-and-trained-models">here</a>. You can use cat images, dog images, buildings, or anything that might interest you. Be creative and show us what you come up with! When you have your training and input images ready, you can get them into Pachyderm using the <code>pachctl</code> CLI tool or one of the Pachyderm clients (discussed in more detail <a href="http://docs.pachyderm.io/en/stable/deployment/inputing_your_data.html">here</a>).</p>
<p>For some inspiration, we ran Pachyderm’s pipeline with Google map images paired with satellite images to create a model that translates Google map screenshots into pictures resembling satellite images. Once we had our model trained, we could stream Google maps screenshots through the pipeline to create translations of them.</p>
<ul>
<li>Visit the GitHub repo linked above to get the reference pipeline specs along with even more detailed instructions.</li>
<li>Join the <a href="http://slack.pachyderm.io/">Pachyderm Slack team</a> to get help implementing your pipeline.</li>
<li>Visit Chris’s <a href="https://github.com/affinelayer/pix2pix-tensorflow">GitHub repo</a> to learn more about the model implementation.</li>
</ul>