Ok, but we need more accurate estimates

We need more accurate estimates

Why is this range so wide?

This is something that most of us have heard at least once. We receive new work we haven’t a clue about, we look at the pile in front of us with no single point of reference, and we are asked how long it will take. Whatever the right answer is, there is one that is universally wrong and terrible, and doesn’t help anyone: “I don’t know”.

Many of us are often asked to provide estimates to help people understand how much time or money they will need to invest. Regardless of how bad the initial situation might seem, I believe that we can always provide an estimate - there are estimation methods we can apply, and usually we have more information about the work than we thought we had. I also haven’t met anyone who was interested in a down-to-the-dollar accurate estimate. It didn’t matter whether they came from the waterfall or the agile world, we always spoke in ranges.

But even if we have a good understanding and apply an estimation technique, the results might not be very compelling to our team or stakeholders, especially when the estimate range comes in wider than expected (sic!). Some people might offer you witty advice to look at the work complexity and learn more about the requirements, but complexity is only one of many factors that drive time and budget estimates. There are far more variables, and uncovering and approximating them can only be achieved by doing the work - this is also the best way to learn more about the complexity itself.

I’m a big fan of Monte Carlo simulations when it comes to estimation. Regardless of whether you apply Monte Carlo or another statistical method, you’re most likely going to end up providing a time range. In most cases, this range will become the basis for your project budget, and this range might come in quite wide.

“A more accurate estimate”

Computer said: “7 to 14 sprints”

Let’s walk through a sample scenario of estimating the delivery time for 46 imaginary user stories. First, we go through the backlog, do the relative sizing, run a simulated sprint planning to get a velocity distribution and plug the data into a simulation model. We look with hope at the screen, and the computer says - 7 to 14 sprints.
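To make these steps a bit more concrete, here is a minimal Monte Carlo sketch in TypeScript. It is not the model behind the charts in this post: the backlog size in points, the velocity samples and all function names are made-up assumptions. Each trial draws a random velocity for every sprint until the backlog is empty, and the spread of the resulting sprint counts is what produces a wide range.

```typescript
// Tiny seeded PRNG (mulberry32) so that repeated runs give the same outcomes.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Runs `trials` simulated projects and returns the number of sprints each one took.
// Every velocity must be greater than zero, otherwise a trial would never finish.
function simulateSprints(
  backlogPoints: number,
  velocities: number[],
  trials: number,
  random: () => number
): number[] {
  const outcomes: number[] = [];
  for (let i = 0; i < trials; i++) {
    let remaining = backlogPoints;
    let sprints = 0;
    while (remaining > 0) {
      // Draw one velocity from the simulated sprint planning results.
      remaining -= velocities[Math.floor(random() * velocities.length)];
      sprints++;
    }
    outcomes.push(sprints);
  }
  return outcomes;
}

// Hypothetical inputs: total relative size of the 46 stories and sampled velocities.
const backlogPoints = 150;
const velocities = [10, 12, 14, 15, 17, 19, 21];

const outcomes = simulateSprints(backlogPoints, velocities, 10000, mulberry32(42));
const fastest = Math.min(...outcomes);
const slowest = Math.max(...outcomes);
console.log(`Simulated range: ${fastest} to ${slowest} sprints`);
```

The more the sampled velocities vary, the further apart the fastest and slowest trials end up, which is exactly why the honest range looks wide.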

Computer said 7 to 14 sprints

What? 7 to 14 sprints? That’s a wide range… and it will take a lot of explanation… We’ve checked the numbers, fixed a few of them manually to make them look better. No changes. The probabilities moved slightly, but the verdict remains - 7 to 14 sprints. We bring the result and the full story about our magnificent, yet slightly overwhelming estimation method to the client, and what we hear is:

-“That’s nice… but… can you give us a more accurate estimate?”

-“Sure, we can do anything on paper. The question is what we want to achieve. Do you want me to tell you something that sounds nice so we can have a few nights of sleep before the death march starts, or should we talk about potential risks while we still have time to react?”

-“But what if project sponsors don’t agree?”

-“How about - what if project sponsors agree and we miss the target by 100%?”


Beware of the average

My stats teacher used to say - ‘People keep drowning in lakes that are 1m deep on average.’

The big temptation in our scenario is to figure out an acceptable contingency and narrow down the range near the average (possibly by using all the available power of wishful productive group thinking). We could return with a response that looks like the one below if we tried really hard, right?

A nice narrow estimate

-“Ah… 3 sprints range… now we’re talking. Great job!”

Hold on! What about the nasty tails on both sides of the estimate? Let’s take a step back - when I told you I’m going to deliver the scope between 7 and 14 sprints, what I was really saying was:

-“I’m 99.999% confident the work, as we know it today, will be completed between 7 and 14 sprints.”-

And when I narrow down the range, what I’m saying is:

-“I’m 61% confident the work will be done between 9 and 11 sprints.”-

-“Only 61%?! But that’s not what we discussed!”

-“You wanted a nice and narrow estimate - there you go…“

What about the risks?

See the tails outside of the nice and narrow estimate? That’s where the remaining 39% sits, and guess what - that’s risk.
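The arithmetic behind these confidence figures is just counting how many simulated outcomes fall inside or outside a range. A minimal sketch, assuming a hypothetical tally of 10,000 simulation runs (the counts are invented to loosely resemble the percentages quoted here; they are not the real model output):

```typescript
// Hypothetical tally: how many of 10,000 simulated runs finished in exactly N sprints.
const runsBySprintCount: { [sprints: number]: number } = {
  7: 400, 8: 1200, 9: 2200, 10: 2300, 11: 1700, 12: 1100, 13: 700, 14: 400,
};

// Share of runs that finished in `from`..`to` sprints (inclusive).
function confidence(
  tally: { [sprints: number]: number },
  from: number,
  to: number
): number {
  let inRange = 0;
  let total = 0;
  for (const key of Object.keys(tally)) {
    const sprints = Number(key);
    total += tally[sprints];
    if (sprints >= from && sprints <= to) {
      inRange += tally[sprints];
    }
  }
  return inRange / total;
}

const narrow = confidence(runsBySprintCount, 9, 11); // the "nice and narrow" range
const early = confidence(runsBySprintCount, 7, 8);   // left tail: done before sprint 9
const late = confidence(runsBySprintCount, 12, 14);  // right tail: not done after sprint 11

console.log({ narrow, early, late });
```

Whatever confidence the narrow range gives up has to land in one of the two tails, and only the right-hand one hurts.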

40% risk on both sides

-“What about the left side? You can finish earlier right?”

You’re right, according to the model there is a 16% chance that it’s going to happen. But fair enough, let’s just talk about the right hand side.

22% risk of not getting job done

When I give you a 3-sprint-wide estimate around the average, we’re betting on a game in which you can win 4 out of 5 times. How much money would you put on that if I only gave you one chance?

Let me demonstrate it in a different way.

Cumulative chances

If we missed the top bar of our estimated time or budget, which can happen 22% of the time after the 11th sprint, part of the work would not be delivered. There are many ways of dealing with this situation - we could actually remove something from the scope - but for the sake of the argument, let’s pretend the backlog consists only of must-have-compliance-or-we-get-sued items. The scope has to be delivered.

In a scenario like this I would look at the model differently. Instead of looking at chances or cumulative chances of delivering the scope, I would look at the risk of not delivering the scope by each sprint:

Not delivering stories

How does it help with understanding the chances? According to our nice and narrow estimate, there is still a 22% chance we won’t finish the project on time. And even though the average says we should deliver in approximately 9 sprints, it’s close to 50/50. How much money would you put on a coin-flip bet?
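Reading the model as the risk of not delivering by sprint N is simply the complement of the cumulative chance of finishing. A rough sketch, again with an invented tally of simulated outcomes (hypothetical numbers, not the data behind the charts):

```typescript
// Hypothetical tally of 10,000 simulated runs: [sprints needed, number of runs].
const tally: Array<[number, number]> = [
  [7, 400], [8, 1200], [9, 2200], [10, 2300],
  [11, 1700], [12, 1100], [13, 700], [14, 400],
];

interface SprintRisk {
  sprint: number;
  riskOfNotDelivering: number; // share of runs still unfinished after this sprint
}

function riskBySprint(rows: Array<[number, number]>): SprintRisk[] {
  const total = rows.reduce((sum, row) => sum + row[1], 0);
  let finished = 0;
  return rows
    .slice()
    .sort((a, b) => a[0] - b[0])
    .map(([sprint, runs]) => {
      // Accumulate the runs that are done by the end of this sprint.
      finished += runs;
      return { sprint, riskOfNotDelivering: (total - finished) / total };
    });
}

for (const row of riskBySprint(tally)) {
  const percent = (row.riskOfNotDelivering * 100).toFixed(0);
  console.log(`after sprint ${row.sprint}: ${percent}% risk of not being done`);
}
```

Presented this way, every candidate deadline comes with the risk attached to it, which makes the conversation about where to draw the line much easier.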

Better narrow range?

One approach would be to move the nice narrow range to the right (because we need to give ranges, right?) and present an estimate of 11 to 13 sprints with a 4% risk. Theoretically, it’s a better approach, but in practice we’re bloating the estimate in true waterfall fashion: the team says $X; the project manager (with experience) knows it will be closer to 2 × $X, while the project sponsor reaches into the pocket for 4 × $X because it’s not the first time the team and the project manager have gotten it wrong.

Better narrow range

What to do?

Yes, we have an ugly, wide estimate. It doesn’t make anyone happy and delivering such news is not easy. But we can deal with it.

First of all, don’t be afraid of the ugly estimate - explain your estimation approach, complexity, known factors and potential sources of risk. There is always a good opportunity to review both the model and the method. If both of them pass the review, you’ll probably end up with a few more people on board, ready to support you. A wise person once wrote “Project management 101: bad news doesn’t get better with time.” Delivering an ugly estimate is much easier at the beginning of the project than at the end.

Secondly, yes, today the estimate is ugly, but guess what: as we go through the project, we learn more about it. Based on this we can update the estimate with actual values and it will become more accurate (yay!). With cycles short enough, we will have enough time to react to these changes. Good news, it doesn’t look like it’s going to be longer than 14 sprints… There’s only a 0.0001% chance.

And what about the scope creep?

What if the scope changes from 46 to 60 stories?

Well, the initial model tells us how much time we need to deliver 46 out of 60 stories in the original order. It also works at any given point in time - today, all we know is that we have 46 stories in their initial priority order. When we add another 14, there are a few possible scenarios:

  • Add them to the tail - not very likely - in this scenario nothing changes in terms of the estimation for the first 46 stories in the order we agreed on.

  • Trade-in/out - likely for fixed budget/time scenarios - we have to estimate the complexity of the new scope and trade it for a part of similar complexity from the original scope.

  • We add and change priorities - almost every time - no problem, let’s crunch the numbers one more time. That’s the whole point, fixing estimates is like fixing a scope - it doesn’t help anyone. They have to be reviewed, monitored and updated to represent our best understanding of the goals.
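Crunching the numbers again after a scope change can be as simple as re-running the simulation with the new backlog size. The sketch below is an illustration only: it counts stories instead of relative sizes, and the per-sprint throughput samples, the seed and the function names are all invented.

```typescript
// Deterministic pseudo-random generator (a plain linear congruential generator).
function lcg(seed: number): () => number {
  let state = seed % 2147483647;
  if (state <= 0) {
    state += 2147483646;
  }
  return () => {
    state = (state * 16807) % 2147483647;
    return (state - 1) / 2147483646;
  };
}

// Sprints needed to finish `stories` in each simulated trial, sorted ascending.
function forecast(
  stories: number,
  throughput: number[],
  trials: number,
  random: () => number
): number[] {
  const results: number[] = [];
  for (let i = 0; i < trials; i++) {
    let left = stories;
    let sprints = 0;
    while (left > 0) {
      left -= throughput[Math.floor(random() * throughput.length)];
      sprints++;
    }
    results.push(sprints);
  }
  return results.sort((a, b) => a - b);
}

// Sprint count within which at least a share `p` (0..1) of the trials finished.
function percentile(sorted: number[], p: number): number {
  const index = Math.min(sorted.length - 1, Math.ceil(p * sorted.length) - 1);
  return sorted[Math.max(0, index)];
}

const throughput = [3, 4, 4, 5, 6, 7]; // hypothetical stories finished per sprint
const before = forecast(46, throughput, 5000, lcg(1));
const after = forecast(60, throughput, 5000, lcg(1));

console.log("46 stories, 85th percentile:", percentile(before, 0.85));
console.log("60 stories, 85th percentile:", percentile(after, 0.85));
```

Comparing the same percentile before and after the change gives a concrete number to bring into the trade-off discussion.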

Regardless of what your scenario is, it’s much easier to engage in a discussion when you have a model that tracks your estimates and is implemented as part of the delivery process. In a similar way, we already use a model that helps track requirements in the form of a living artefact - a backlog.

Simple PDF text extractor for Azure Data Lake Analytics

Writing a custom PDF extractor for Azure Data Lake Analytics

For some time now I’ve been working on a pet project that helps me with home budgeting and acts as a training ground for learning new things.

This time I was trying to experiment with Azure Data Lake Analytics to see if it can help with processing PDFs.

ADLA extractors

Out of the box, Azure Data Lake Analytics supports CSV, TSV and text files. The content of these formats can be read line by line using an EXTRACT expression with one of the available extractors. After installing the Cognitive extensions we also get access to image extractors that support graphic files.

In my case I wanted to scrape text from invoices and statements I receive electronically and process the data using ADLA itself. Given that OCR is available in the Cognitive extensions, I initially thought about rendering PDFs, but it seemed like overkill, especially since my PDFs have nicely organised text layers.

So, I’ve decided to write my own extractor.

Putting custom extractor together

I found a really good way of understanding how to write Azure Data Lake UDOs (YAY for TLA!): the Visual Studio 2017 U-SQL sample unit test template, available if you have installed the Azure Data Lake tooling.

ADLA U-SQL Unit Test Sample

This template creates a sample schema and test data for several types of custom U-SQL UDOs. What’s really cool, it’s not a simple scaffold for a unit test - instead it’s a complete example of how to work with and test your UDOs and how the input and output are being processed within your U-SQL script.

I’ve created a new C# class library for U-SQL apps using the Visual Studio template, which pretty much creates a C# class library project with two additional dependencies and Visual Studio support for registering the assembly as a U-SQL extension.

ADLA Class library for U-SQL application

My user-defined extractor is quite simple and relies on the iTextSharp library. There is no rocket science there:

using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
using Microsoft.Analytics.Interfaces;

namespace PDFExtractor
{
    [SqlUserDefinedExtractor(AtomicFileProcessing = true)]
    public class PDFExtractor : IExtractor
    {
        public override IEnumerable<IRow> Extract(IUnstructuredReader input, IUpdatableRow output)
        {
            var reader = new PdfReader(input.BaseStream);
            for (var page = 1; page <= reader.NumberOfPages; page++)
            {
                output.Set(0, page);
                output.Set(1, ExtractText(reader, page));
                yield return output.AsReadOnly();
            }
        }

        public string ExtractText(PdfReader pdfReader, int pageNum)
        {
            var text = PdfTextExtractor.GetTextFromPage(pdfReader, pageNum, new LocationTextExtractionStrategy());
            // Encode new lines to prevent line breaking in text editors,
            // I want nice line-after-line files
            return text.Replace("\r", "\\r").Replace("\n", "\\n");
        }
    }
}

After building and testing the extension (once again, big kudos to the ADLA VS team who put the unit test sample project together), I was able to publish it to my ADLA catalog and try it in a U-SQL script. Because I was using the VS template for U-SQL class libraries, Visual Studio picked it up and enabled the “Register Assembly” operation on the project.

Visual Studio support for registering U-SQL extension assembly

Because I used iTextSharp, I had to mark that my extension comes with external dependencies, and voila, I was able to extract the text layer of my PDFs in ADL.

Registering assembly dialog

Using user defined extractor in U-SQL

In order to use my extension in U-SQL script I had to reference both my assembly and its dependency.


@pages =
    EXTRACT FileName string,
            PageNo int,
            Content string
    FROM "/input_files/{FileName}.pdf"
    USING new PDFExtractor.PDFExtractor();

OUTPUT @pages
TO "/pdf.csv"
USING Outputters.Csv(outputHeader : true);
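One thing to keep in mind when consuming /pdf.csv: the extractor stores line breaks as literal \r and \n sequences, so whatever reads the Content column has to turn them back into real line breaks. A small TypeScript sketch of that round trip (the function names and the sample text are hypothetical, and the encode function only mirrors the two Replace calls from the C# code):

```typescript
// Mirrors the Replace calls in the C# extractor: real line breaks become
// the two-character sequences backslash+r and backslash+n.
function encodeLineBreaks(text: string): string {
  return text.replace(/\r/g, "\\r").replace(/\n/g, "\\n");
}

// Reverses the encoding for a value read back from the Content column.
function decodeLineBreaks(content: string): string {
  return content.replace(/\\r/g, "\r").replace(/\\n/g, "\n");
}

const page = "Invoice 001\r\nTotal: 10.00"; // hypothetical page text
const stored = encodeLineBreaks(page);
const restored = decodeLineBreaks(stored);

console.log(stored.indexOf("\n") === -1); // the stored value fits on a single line
console.log(restored === page);           // decoding gives the original text back
```

This simple scheme is not fully reversible: page text that already contains a backslash followed by r or n would be decoded into a line break as well. For invoices and statements that is unlikely to matter, but it is worth knowing.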

And that’s it!

Drawing histograms using D3 and typescript


If you ever played with D3, you’ve most likely found https://bl.ocks.org - Mike Bostock’s site where he explains how to draw D3 charts from very basic to quite complex ones. On several occasions I followed his examples to draw charts on my sites.

Recently, I started working on a TypeScript app and had to port JS examples to TypeScript.

I used one of Mike’s samples, the histogram from his site, to explain how to translate it to TypeScript and wrap it in an Angular2 component.

Click here to find a GIST with the full source code


In order to add this example to your TS app you’ll need the following packages:

npm install d3-array d3-axis d3-format d3-random d3-scale d3-selection --save

Differences between JS and TS code


This example follows the modular approach for D3 v4. Finding the correct module to import is a matter of going to the D3 v4 API reference and finding which d3-x module contains the types you need.

import { select } from 'd3-selection';
import { scaleLinear } from 'd3-scale';
import { range, histogram, max } from 'd3-array';
import { format } from 'd3-format';
import { randomBates } from 'd3-random';
import { axisBottom } from 'd3-axis';

Using TS generics properly

When using TypeScript we need to specify the types we use in generic classes.

So JavaScript like this:

var x = d3.scaleLinear().rangeRound([0, width]);

becomes this in TypeScript:

let x = scaleLinear<number>().rangeRound([0, width]);

Avoiding minor @types issues

For d3-array version 1.2.0, passing x.domain() as the histogram.domain() argument raises a type error.

After drilling into the d3-array.d.ts definition I’ve found this:

interface Histogram<T> {
    /**
     * 2: numeric data → [min, max]
     * @link https://github.com/d3/d3-array#histogram_domain
     */
    // SOMEDAY => any → => $$.Orderable
    domain():(values:$$.Orderable[]) => any;
    domain(value:[$$.Orderable, $$.Orderable]):this;
    domain(value:(values:$$.Orderable[]) => any[]):this;
}


So, I decided to go with the third option and pass the domain as a function. The JavaScript that starts with

var bins = d3.histogram()

becomes

let generator = histogram<number>()
                .domain(d => x.domain())
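My reading of why the direct call fails, assuming the scale’s domain() is typed as returning a plain number[] (as it is in the DefinitelyTyped d3-scale typings): an array of unknown length is not assignable to a two-element tuple, but a function returning that array matches the accessor overload. The sketch below reproduces the situation without D3; HistogramLike, FakeHistogram and scaleDomain are simplified stand-ins I made up, not the real declarations.

```typescript
// Simplified stand-in for the overloads quoted above (not the real d3-array typings).
interface HistogramLike {
  domain(value: [number, number]): this;
  domain(value: (values: number[]) => number[]): this;
}

// Stand-in for scale.domain(), typed as number[] rather than a [min, max] tuple.
function scaleDomain(): number[] {
  return [0, 1];
}

class FakeHistogram implements HistogramLike {
  resolved: number[] = [];

  domain(value: [number, number] | ((values: number[]) => number[])): this {
    // Accept either a fixed [min, max] pair or an accessor that computes it.
    this.resolved = typeof value === "function" ? value([]) : value;
    return this;
  }
}

const generator = new FakeHistogram();

// generator.domain(scaleDomain());    // compile error: number[] is not a [number, number] tuple
generator.domain(() => scaleDomain()); // compiles: matches the accessor overload

console.log(generator.resolved);
```

Wrapping the call in an arrow function is therefore less of a hack than it looks: it just picks the overload whose types happen to line up.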

Extra - using Angular2 component template with D3

In order to use D3 to modify part of your Angular template DOM, you’ll need to use ElementRef to pass the template into your component code.

import { Component, OnInit, ElementRef } from '@angular/core';
import { select } from 'd3-selection';

@Component({
    template:'<h1>Hello histogram</h1>\
              <svg id="hist" width="960" height="500"></svg>'
})
export class HistogramComponent implements OnInit {

    el: HTMLElement;

    constructor(private elementRef: ElementRef){
        this.el = elementRef.nativeElement;
    }

    ngOnInit(): void {

        let hist = select(this.el).select('#hist');
        // Follow with your regular D3 flow
    }
}


Configuring Angular2 with ASP.NET Core

Angular2 with ASP.NET Core - Step-by-step guide


This guide is based on the Angular 2 - Quickstart. You can refer to the Angular 2 - Quickstart guide to check the exact contents of the application and configuration files.

In this guide I’m using the following versions:

λ node -v 
λ npm -v
λ dotnet --version

On top of that I’m using Visual Studio 2015 Update 2. Given how quickly dotnetcore changes, some steps may look different.

I’ve recorded all steps as commits in my Github repository: https://github.com/random82/angular2-aspnet-core

Step 1 - Create new ASP.NET Core MVC application

Create a new ASP.NET Core Web Application project using Visual Studio.

Create new project with Visual Studio

Select Web Application template.

Select Web Application

You should be able to see a similar folder structure being created.

Web Application - default folder structure

We’re going to use this folder structure when configuring Angular2 dependencies and the application itself.

Step 2 - Modify gulpfile.js in the project folder

Angular2 Quickstart recommends using NPM to manage Angular libraries. I’m going to follow this recommendation. By default an ASP.NET Core application, similar to the one we’ve just created, stores the front end files inside the wwwroot folder. I’m going to keep it this way and use Gulp to move dependencies around and put them in a folder that will be automatically deployed with the web application.

Add npm folders to gulpfile

Inside gulpfile.js you will find the paths variable. We have to modify it to hold the source and destination folders of our new dependencies.

var paths = {
    js: webroot + "js/**/*.js",
    minJs: webroot + "js/**/*.min.js",
    css: webroot + "css/**/*.css",
    minCss: webroot + "css/**/*.min.css",
    concatJsDest: webroot + "js/site.min.js",
    concatCssDest: webroot + "css/site.min.css",

    npmLibSrc: "./node_modules/",
    npmLibDest: webroot + "lib/npm"
};

Add clean task for npm folders

The next step is to use the destination folder we just added to paths and add a new clean task. This way Gulp will delete the dependencies when we need to refresh our application.

gulp.task("clean:npmlib", function (cb) {
    rimraf(paths.npmLibDest, cb);
});

Now, we have to add the new clean task as a global clean task dependency.

gulp.task("clean", ["clean:js", "clean:css", "clean:npmlib"]);

Create copy tasks for Angular modules and dependencies

In order to move Angular dependencies from the node_modules folder in our project root folder to the wwwroot folder we need to copy the required files.

gulp.task("copy:systemjs", function () {
    return gulp.src(paths.npmLibSrc + '/systemjs/dist/**/*.*', { base: paths.npmLibSrc + '/systemjs/dist/' })
        .pipe(gulp.dest(paths.npmLibDest + '/systemjs/dist/'));
});

gulp.task("copy:angular2", function () {
    return gulp.src(paths.npmLibSrc + '/@angular/**/*.js', { base: paths.npmLibSrc + '/@angular/' })
        .pipe(gulp.dest(paths.npmLibDest + '/@angular/'));
});

gulp.task("copy:core-js", function () {
    return gulp.src(paths.npmLibSrc + '/core-js/**/*min.js', { base: paths.npmLibSrc + '/core-js/' })
        .pipe(gulp.dest(paths.npmLibDest + '/core-js/'));
});

gulp.task("copy:rxjs", function () {
    return gulp.src(paths.npmLibSrc + '/rxjs/**/*.js', { base: paths.npmLibSrc + '/rxjs/' })
        .pipe(gulp.dest(paths.npmLibDest + '/rxjs/'));
});

gulp.task("copy:zone.js", function () {
    return gulp.src(paths.npmLibSrc + '/zone.js/dist/*.*', { base: paths.npmLibSrc + '/zone.js/dist/' })
        .pipe(gulp.dest(paths.npmLibDest + '/zone.js/dist/'));
});

gulp.task("copy:angular-in-memory", function () {
    return gulp.src(paths.npmLibSrc + '/angular2-in-memory-web-api/*.js', { base: paths.npmLibSrc + '/angular2-in-memory-web-api/' })
        .pipe(gulp.dest(paths.npmLibDest + '/angular2-in-memory-web-api/'));
});

gulp.task("copy:reflect-metadata", function () {
    return gulp.src(paths.npmLibSrc + '/reflect-metadata/*.*', { base: paths.npmLibSrc + '/reflect-metadata/' })
        .pipe(gulp.dest(paths.npmLibDest + '/reflect-metadata/'));
});

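After creating the copy tasks, we can group them under a single copy-dep task so they will be more manageable.

gulp.task("copy-dep", ["copy:systemjs", "copy:angular2", "copy:core-js", "copy:rxjs", "copy:zone.js", "copy:angular-in-memory", "copy:reflect-metadata"]);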

Now, we can create a new task we will use for publishing our front end application. It will run the default min task and copy-dep we’ve just created.

gulp.task("publish", ["min", "copy-dep"]);

From this point we can run gulp publish from the root folder of our project to update front end application files.

Upgrade project.json file with new tasks

In order to include our new gulp tasks in the Visual Studio build script, we need to modify the prepublish scripts in the project.json file.

{
    /* Removed for brevity */
    "scripts": {
        "prepublish": [ "npm install", "bower install", "gulp clean", "gulp publish" ],
        "postpublish": [ "dotnet publish-iis --publish-folder %publish:OutputPath% --framework %publish:FullTargetFramework%" ]
    }
}

At this point we should have all our dependencies put in the right place and we can start configuring Angular2 to build our application.

Step 3 - Configure SystemJS

In order to configure SystemJS we need to create a systemjs.config.js file inside the wwwroot folder. I used the SystemJS configuration file from the Angular2 - Quickstart guide. However, in order to load files from the wwwroot folder it has to be modified accordingly.

/**
 * System configuration for Angular 2 samples
 * Adjust as necessary for your application needs.
 */
(function(global) {
// map tells the System loader where to look for things

var libFolder = 'lib/npm/';

var map = {
    'app':                        'app', // 'dist',
    '@angular':                   libFolder + '@angular',
    'angular2-in-memory-web-api': libFolder + 'angular2-in-memory-web-api',
    'rxjs':                       libFolder + 'rxjs'
};

/* Removed for brevity */

})(this);
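To illustrate what the map entries are for, here is a small TypeScript sketch of prefix mapping. It is a deliberately simplified illustration, not how SystemJS is implemented: the real loader also applies package settings such as the main file and default extension, and the resolve function below is a made-up helper.

```typescript
const libFolder = "lib/npm/";

// Same entries as in the configuration above.
const map: { [name: string]: string } = {
  "app": "app",
  "@angular": libFolder + "@angular",
  "angular2-in-memory-web-api": libFolder + "angular2-in-memory-web-api",
  "rxjs": libFolder + "rxjs",
};

// Replaces the longest matching mapped prefix of a module name with its folder.
function resolve(moduleName: string): string {
  const prefixes = Object.keys(map).sort((a, b) => b.length - a.length);
  for (const prefix of prefixes) {
    if (moduleName === prefix || moduleName.indexOf(prefix + "/") === 0) {
      return map[prefix] + moduleName.slice(prefix.length);
    }
  }
  // Names without a mapping are left untouched.
  return moduleName;
}

console.log(resolve("@angular/core"));   // lib/npm/@angular/core
console.log(resolve("rxjs/Observable")); // lib/npm/rxjs/Observable
```

Because the paths are relative, they end up pointing inside wwwroot, which is where the gulp copy tasks put the libraries.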

Step 4 - Create Angular2 application

Now, it’s time to start building our Angular2 application. Since the purpose of this guide is to walk through the integration with an ASP.NET Core application, I will be using the sample files from the Angular 2 - Quickstart.


import { Component } from '@angular/core';

@Component({
    selector: 'my-app',
    template: '<h1>My First Angular 2 App</h1>'
})
export class AppComponent { }


import { bootstrap }    from '@angular/platform-browser-dynamic';
import { AppComponent } from './app.component';

bootstrap(AppComponent);

Step 5 - Configure TypeScript transpiler

Create a tsconfig.json file in the root directory of your web application project.

{
    "compilerOptions": {
        "target": "es5",
        "module": "commonjs",
        "moduleResolution": "node",
        "sourceMap": true,
        "emitDecoratorMetadata": true,
        "experimentalDecorators": true,
        "removeComments": false,
        "noImplicitAny": false
    }
}

Rebuild the project with Visual Studio. If everything is configured correctly, you should be able to see the generated JS files in Visual Studio.

Successful TypeScript transpilation

Step 6 - Embed your Angular2 application in your ASP.NET Core MVC app

Now it’s time to embed our Angular2 component in a Razor view so it will be served by the ASP.NET Core MVC application. In order to do it, we’ll modify the default layout file to load all the libraries required by Angular2.

Modify Views\Shared\_Layout.cshtml

Inside the _Layout.cshtml file, find the <body> element and replace it with the one below.

<body>
    @* Default layout markup (navigation, the RenderBody call, footer) omitted here *@

    <environment names="Development">

        <!-- Polyfill(s) for older browsers -->
        <script src="~/lib/npm/core-js/client/shim.min.js"></script>
        <script src="~/lib/npm/zone.js/dist/zone.js"></script>
        <script src="~/lib/npm/reflect-metadata/Reflect.js"></script>
        <script src="~/lib/npm/systemjs/dist/system.src.js"></script>
        <!-- 2. Configure SystemJS -->
        <script src="systemjs.config.js"></script>
        <script>
            System.import('app').catch(function(err){ console.error(err); });
        </script>
    </environment>
    <environment names="Staging,Production">

    </environment>

    @RenderSection("scripts", required: false)
</body>

Modify Views\Home\Index.cshtml

For the purpose of this guide I’ll be using the home\index.cshtml view that is loaded when we navigate to the root URL of our web app.

In order to load your Angular2 component, simply replace the content of the home\index.cshtml file with the snippet below.

@{
    ViewData["Title"] = "Home Page";
}

<my-app>Loading my app</my-app>

Step 7 - Run your application

Run your application using Visual Studio. If everything is configured properly, your default browser should open with a new home index view like this:

Hello Angular 2


If your application does not appear correctly, check if the NPM dependencies are properly loaded. Try running gulp publish from the root folder of your web application after the VS rebuild. Sometimes the Visual Studio build does not execute gulp tasks correctly.

Do we really need it right now?

So, I’ve noticed you’re working on…

Imagine a scenario - you’re surrounded by passionate people who love what they do, they are eager to try cool stuff and they quickly turn their visions into outcomes.

Similar scenario - a development team that has recently been enabled and is taking ownership of more things. They’re excited about doing things differently, not only in development but also in infrastructure, automation, process changes and everything else they can get their hands on.

People around you keep trying new stuff, configuring more tools and working on improvements with happy faces.

Awesome! Right? Well, it depends. The question worth asking is: do we really need that improvement right now?


- Hey, I’ve noticed you’ve been working on our new release tool for the last 3 days…

- Sure! We are doing most of the stuff the old way, and that holds us back. What I’m working on will help us have better control over our release frequency. It covers builds, deployments and automated testing with a single push of a button.

- That’s cool, but it looks like we’re far away from our sprint goal and we made commitments. You were there with us when we planned the sprint… BTW - are you sure we need all of this now?

- So, you don’t want me to work on improvements?! We need to improve things! Sooner or later we’ll need them!

- You’re right and I like your commitment, but there will be no value in having any improvement if we fail to deliver features…

- So, should I drop what I’m doing?

- Let me think… How much would it take to have proper builds? Our current setup is painful…

- I already got it covered

- Awesome! Let’s use this part for now and make sure you put the rest of your ideas in the backlog. We’ll talk to the team; hopefully we will find some space to work on them in the next sprint.

Improvements and technical procrastination

- Have you done your homework? - Not yet, I have to check on something on the Internet first.

Working on new stuff is exciting. It keeps us motivated. The problem is - sometimes it keeps us distracted from our main commitments. The bigger problem is - sometimes these activities are hidden from the product owner, and sometimes they are even hidden from the team. They lurk in the shadows till our internal change agent finishes the work and we show our new baby to the team…

Guns still blazing, eyes full of passion, ready to show the latest improvement but… there is no one to listen… The rest of the team is busy working on features, and they had more to do because one of their team members didn’t communicate his ideas and plans.

Sometimes working on features is not the most exciting thing we do, given all the new tools and practices around us. However, we can’t use an improvement as an excuse for not doing the work that the rest of the team committed to deliver. Also, we can’t hide our commitments from the team.

Improvements as user stories

So, no improvements then? No! Improvements are one of the most important things in our work. We should keep questioning our status quo and look for opportunities to reduce waste and make things go more smoothly. However, we have to remember that working on them adds to the team’s commitments.

How to manage them efficiently? We can manage them in a similar way to other commitments. We can write them down as requirements with an expected outcome and, preferably, an estimated value. They can also be expressed as user stories.

A very naive example could be:

-As a product owner I want to release daily to get users feedback more frequently

We can manage improvements with a backlog. Ideally it should be the same backlog that the team uses for a product or their work stream. It will help with keeping things transparent and trigger discussions about the hypothetical value of each improvement. Also, they should go through refinement, prioritisation and planning - in the same way as other user stories.

Improvements should have an owner and be prioritised. Once again, having a Product Owner who understands how technical improvements tie in with concepts like time to market would be awesome. The owner of an improvement should be capable of explaining the hypothetical value of implementing it to the rest of the organisation.

Don’t be a cowboy change agent

  • Have a vision
  • Keep pushing limits
  • Prioritise work
  • Don’t overthink your problems
  • Keep changes small and lightweight - smaller, simpler pieces will be adopted faster
  • Limit your improvement WIP - don’t overwhelm everyone with number of changes
  • Be patient, you’ll get there
  • Get things done
