28 Sep 2016, 13:15

gRPC wrong types context and Go 1.7

If you are following Go development you probably know that Go 1.7 moves the golang.org/x/net/context package into the standard library as context. Yay!
Unfortunately it does not work everywhere, and I’ve spent some time understanding this one.

For example, if you are using gRPC you can hit this problem. Here is an interface generated by gRPC:

type APRSServer interface {
    GetPastMessages(context.Context, *Point) (*ARPSMessages, error)
}

But when compiling:

/main.go:153: cannot use &s (type *Server) as type protorpc.APRSServer in argument to protorpc.RegisterAPRSServer:                                                                           
        *Server does not implement protorpc.APRSServer (wrong type for GetPastMessages method)                                                                                                
                have GetPastMessages("context".Context, *protorpc.Point) (*protorpc.ARPSMessages, error)                                                                                      
                want GetPastMessages("golang.org/x/net/context".Context, *protorpc.Point) (*protorpc.ARPSMessages, error)   

The compiler is complaining about mismatched types for the context argument.
The problem is that the gRPC generated code imports context as golang.org/x/net/context; that’s the only way it remains compatible with both Go 1.6 and Go 1.7.

So a quick solution is to import context in your own code using the old path:

// we have to use the old context path here for gRPC compat see https://github.com/grpc/grpc-go/issues/711
context "golang.org/x/net/context"
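To see why the compiler refuses the method, here is a minimal sketch using only the standard library. OldContext and NewContext are hypothetical stand-ins for the two Context types: they have the same method set but are declared separately, so a method taking one does not match an interface that asks for the other.

```go
package main

import (
	"fmt"
	"reflect"
)

// OldContext and NewContext stand in for the two Context types (hypothetical
// names): identical method sets, but distinct declared types.
type OldContext interface{ Err() error }
type NewContext interface{ Err() error }

// APRSServer mimics generated code that still refers to the old type.
type APRSServer interface {
	GetPastMessages(OldContext) error
}

// Server is written against the new type, like code importing "context".
type Server struct{}

func (s *Server) GetPastMessages(NewContext) error { return nil }

// FixedServer uses the same type as the generated interface.
type FixedServer struct{}

func (s *FixedServer) GetPastMessages(OldContext) error { return nil }

// implements reports whether the dynamic type of v satisfies APRSServer.
func implements(v interface{}) bool {
	iface := reflect.TypeOf((*APRSServer)(nil)).Elem()
	return reflect.TypeOf(v).Implements(iface)
}

func main() {
	fmt.Println(implements(&Server{}))      // false: parameter types differ
	fmt.Println(implements(&FixedServer{})) // true: same parameter type
}
```

Method signatures have to be identical for an interface to be satisfied, which is why importing the same path as the generated code is enough to fix the build.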

Also note that go tool fix is now capable of fixing the import path with -force context.

22 Sep 2016, 15:39

Using Go mobile on iOS for real

Go Mobile can generate native frameworks for iOS and Android from Go code, and I was curious what could be achieved with it.
Most tutorials stop at Hello world and I wanted to test it with real code.
You can use it to build a full app using only Go code, but I’m only interested in the bindings part (SDK applications): a native ObjC/Swift app calling Go code.

I’m using some existing Go code, regionagogo (a geofence database), which is moderately complex since it uses BoltDB and the Google S2 library.

Go Mobile limits you to a subset of types, mainly so that they can be translated correctly to Java and ObjC.

So the first thing to do is to write a Go wrapper, like you would do in ObjC to call some C++ code, but first note the constraints:

Functions must return either no results, one result, or two results where the type of the second is the built-in error type.
So far no major complication, but note that you can’t return slices, which can be a major pain. There are workarounds like go-mobile-collection, which can generate an API to operate on slices.
You also have to respect some naming conventions, such as New...() for your constructors.

In the end your wrapper won’t look very Goish, but that’s what it takes for it to be translated.
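As an illustration of the slice restriction, one hand-written workaround is to hide the slice behind a small collection type with Len and Get methods. This is only a sketch: Fence, Fences and newFences are hypothetical names, not part of regionagogo or of go-mobile-collection, and a real binding would live in the wrapper package rather than in main.

```go
package main

import "fmt"

// Fence is a hypothetical result type; only its shape matters here.
type Fence struct {
	Name string
}

// Fences wraps a []*Fence so callers never see the slice itself.
type Fences struct {
	items []*Fence
}

// Len returns the number of fences in the collection.
func (f *Fences) Len() int { return len(f.items) }

// Get returns the fence at index i, or nil when i is out of range.
func (f *Fences) Get(i int) *Fence {
	if i < 0 || i >= len(f.items) {
		return nil
	}
	return f.items[i]
}

// newFences builds a collection from names; a real wrapper would fill it
// from a database query instead.
func newFences(names ...string) *Fences {
	fs := &Fences{}
	for _, n := range names {
		fs.items = append(fs.items, &Fence{Name: n})
	}
	return fs
}

func main() {
	fs := newFences("Hawaii", "Quebec")
	for i := 0; i < fs.Len(); i++ {
		fmt.Println(fs.Get(i).Name)
	}
}
```

The native side then iterates with an index, which is less pleasant than a real array but only needs methods and pointer types.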

Knowing those constraints you can write a simple wrapper like this:

package mobile

type GeoDB struct {
	db regionagogo.GeoFenceDB
}

func NewGeoDB() *GeoDB {
	g := &GeoDB{}
	return g
}

func (g *GeoDB) OpenDB(path string) error {
	db, err := rbolt.NewGeoFenceBoltDB(path)
	if err != nil {
		return err
	}
	g.db = db
	return nil
}

...

Then use the gomobile command:

gomobile bind -target=ios github.com/akhenakh/regionagogo/mobile

It generates a native iOS Mobile.framework; drop it into your Xcode project and start using it.

  • NewGeoDB() translates to GoMobileNewGeoDB() and returns a GoMobileGeoDB* in the ObjC domain.
  • OpenDB(path string) error translates to - (BOOL)openDB:(NSString*)path error:(NSError**)error.

A simple example would be:

GoMobileGeoDB *db = GoMobileNewGeoDB();
NSString *resourcePath = [[NSBundle mainBundle] pathForResource:@"region" ofType:@"db"];
        
NSError *error = nil;
// check the BOOL result rather than the error pointer, following Cocoa conventions
if (![db openDB:resourcePath error:&error]) {
    NSLog(@"error opening db %@", [error localizedDescription]);
    return;
}

It performs very well: on my iPhone 6 I get around 4,000 queries per second when testing a position inside a small fence, and up to 60,000 queries per second on a miss (the overhead of calling Go is not that big), all while using a ridiculously small amount of memory.

I’ve made a demo iOS app where you can tap the map and it tells you which fence you are in.

Remember this is running locally on your phone without any network access (except for the map); the geo computation is done and the polygons are returned by Go code.

Until now I had to maintain libraries in both languages; Go mobile is a nice alternative!

Sources for the wrapper are available in the regionagogo gomobile branch, and here is the iOS demo app.

20 Sep 2016, 17:40

gRPC Envoy Nghttp2 and Load Balancing

I’ve been using gRPC at work and in several personal projects for months and I’m happy with it, but when it comes to load balancing gRPC does not come with batteries included.

For a long time the only document was the Load Balancing draft in the gRPC repo: clients should implement a Picker interface to know about the servers, so pooling and controlling the load were handled by the clients.
HTTP/2 was new and most reverse proxy implementations were not capable of load balancing gRPC HTTP/2 frames. The only solution was to use a TCP load balancer, which produced errors and weird behaviours for the clients.
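To make the client side approach concrete, here is a minimal round-robin selection sketch. It is not the gRPC Picker interface: the roundRobin type and its pick method are hypothetical and only show the kind of bookkeeping each client had to carry.

```go
package main

import (
	"fmt"
	"sync"
)

// roundRobin hands out backend addresses in turn (hypothetical type).
type roundRobin struct {
	mu    sync.Mutex
	addrs []string
	next  int
}

// pick returns the next address, or "" when no backend is known.
func (r *roundRobin) pick() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.addrs) == 0 {
		return ""
	}
	a := r.addrs[r.next]
	r.next = (r.next + 1) % len(r.addrs)
	return a
}

func main() {
	rr := &roundRobin{addrs: []string{"localhost:50050", "localhost:50051"}}
	for i := 0; i < 4; i++ {
		fmt.Println(rr.pick())
	}
}
```

A real client would also have to track which backends are alive, which is exactly the work a proxy takes off your hands.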

At least two projects now support gRPC load balancing easily.

  • The recently announced Envoy from Lyft
  • And nghttpx from nghttp2

Here are some notes to simply load balance two gRPC Helloworld servers running on ports 50050 & 50051.

  • For nghttp2, a simple configuration file will do

    frontend=*,8000;no-tls
    backend=localhost,50050;;no-tls;proto=h2;fall=2;rise=2
    backend=localhost,50051;;no-tls;proto=h2;fall=2;rise=2
    workers=8
    
  • For envoy, here is the cluster part

    "clusters": [
      {
        "name": "local_service",
        "connect_timeout_ms": 250,
        "type": "static",
        "lb_type": "least_request",
        "features": "http2",
        "hosts": [
          {
            "url": "tcp://127.0.0.1:50050"
          },
          {
            "url": "tcp://127.0.0.1:50051"
          }
        ]
      }
    ]
    

You can then tweak the greeter_client to loop over requests, so you can simulate a client doing multiple requests while killing/restarting your servers.

for {
    r, err := c.SayHello(context.Background(), &pb.HelloRequest{Name: name})
    if err != nil {
        log.Printf("could not greet: %v", err)
        continue // r is nil on error, so skip the reply below
    }
    log.Printf("Greeting: %s", r.Message)
}

And modify the greeter_server to show which port/server your response comes from:

// SayHello implements helloworld.GreeterServer
// port is assumed to be a package-level string holding this server's listen port
func (s *server) SayHello(ctx context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
    return &pb.HelloReply{Message: "Hello " + in.Name + port}, nil
}

Those tests aren’t enabling any TLS so use grpc.WithInsecure().

Note that Envoy is also capable of bridging your HTTP/1.1 queries to gRPC, which is a killer feature (I haven’t tested it yet); you would normally do it in code with gRPC-gateway.

Envoy is really new and I’m still digging into it, but it already proves itself to be a complete load balancing proxy solution, with or without gRPC in your stack.

03 Jul 2016, 08:15

Enabling Gometalinter with Jetbrains Editors

I’ve been using a Jetbrains editor (the free Idea community edition, or Pycharm) with the Go plugin and I’m very happy with this setup. The editor provides some realtime linting, but I was missing gometalinter.

First install gometalinter

go get -u github.com/alecthomas/gometalinter
gometalinter --install --update

To add support inside Jetbrains editors use the External tools feature.
In Preferences > Tools > External Tools, add a configuration.

gometalinter setup for jetbrains

Set the Program path to your $GOPATH/bin/gometalinter
Parameters to --deadline=15s --vendor $FileDir$
Working directory to $ProjectFileDir$

For the linter messages to be clickable and jump to the code, add an Output filter with this regular expression to match the output: $FILE_PATH$:$LINE$

gometalinter setup filter for jetbrains

08 Mar 2016, 11:23

A geo database for polygons, optimization

If you read this blog, you know I’ve recently released a project called regionagogo, a geo shape lookup database, described in this blogpost.

It uses the current Go S2 implementation, which is not yet as complete as the C++ implementation. For example, the region coverer does not really compute cells around the shape itself but around its bounding box instead.

Using the shape of the polygon makes the covering cells more precise and smaller, which in the end means fewer PIP (point in polygon) tests, which are costly.

S2 over a shape
The same coverage with Go S2 would have returned 8 big cells of the same size, spilling over other regions.

I’ve created regionagogogen, a quick and dirty command line program for OSX that takes a GeoJSON file containing your regions and computes the database using the S2 C++ implementation. It’s OSX only because I’m using ObjC as a bridge to C++, which I don’t know well enough.

Also note that regionagogogen includes an S2 port that works on iOS/OSX and some ObjC geo helper classes.

The Docker image for regionagogo is also using the optimized database.

18 Feb 2016, 17:16

A geo database for polygons, foundations

In a previous post, I described how to use the S2 geo library to create a fast geo database, but it only stored locations (points) and only performed range queries. A complete geo database would also support region/polygon queries.

Looking for a solution

I had this need: querying for the countries or subregions of hundreds of coordinates per second, without relying on an external service.

One solution, using my previous technique, could have been to store every city in the world and then perform a proximity query around my point to get the closest cities, but it works only in populated areas and it’s only an approximation.

I looked into other solutions. There are some smart ideas using UTF-grid, but it’s a client side solution and also an approximation tied to the resolution of the computed grid.

S2 to the rescue

S2 cells have some nice properties: they are segments on the Hilbert curve, expressed as ranges of uint64. So I had the intuition that the problem of fast region lookup could be reduced to finding all the mathematical segments containing my location, itself expressed as a uint64.

S2 over a country

Using a Segment Tree data structure, I first tried an in-memory engine: using Natural Earth Data, I load the shapes of all the world’s countries into S2 loops (a Loop represents a simple spherical polygon), transform them into cells using the region coverer (it returns cells of different levels), and add them to the segment tree.

Segment Tree

To query, simply transform the location into an S2 Cell (level 30) and perform a stabbing query that intersects the segments; every segment crossed is a cell that covers part of a Loop.
This reduces the problem to testing a few Loops instead of thousands of them. Finally, perform ContainsPoint against the loops found, because the point could be inside the Cell but not inside the Loop itself.
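The first phase of that lookup can be sketched with plain uint64 ranges. To keep it short this uses a naive linear scan standing in for the segment tree, and the segment type, the loop IDs and the numeric ranges are made up for the example; the ContainsPoint check on the candidates is left out.

```go
package main

import "fmt"

// segment is a closed range [min, max] of cell IDs covering part of the
// loop identified by loopID (hypothetical type).
type segment struct {
	min, max uint64
	loopID   int
}

// stab returns the IDs of the loops owning a segment that contains id,
// without duplicates. A segment tree answers the same question without
// visiting every segment.
func stab(segs []segment, id uint64) []int {
	seen := map[int]bool{}
	var out []int
	for _, s := range segs {
		if s.min <= id && id <= s.max && !seen[s.loopID] {
			seen[s.loopID] = true
			out = append(out, s.loopID)
		}
	}
	return out
}

func main() {
	segs := []segment{
		{100, 199, 1},
		{150, 300, 2},
		{160, 170, 1},
		{400, 500, 3},
	}
	// Candidates still need a point-in-polygon test afterwards.
	fmt.Println(stab(segs, 165))
	fmt.Println(stab(segs, 350))
}
```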

Et voilà! It works!

The segment tree structure itself is very light on memory, and the loop/polygon data can be stored on disk and loaded on request; I’ve tested a second implementation on top of LevelDB using this technique.

If you have a very large tree (for example city limits for the whole world), you can even put the segment tree in a KV store, using the paper Interval Indexing and Querying on Key-Value Cloud Stores.

Region a gogo

As a demonstration, here is a working microservice called regionagogo, which simply returns the country & state for a given location.
It loads geo data for the whole world and answers HTTP queries using a small amount of memory.

GET /country?lat=19.542915&lng=-155.665857

{
    "code": "US",
    "name": "Hawaii"
}

Here is a Docker image so you can deploy it on your stack.

Note that it performs really well but can be improved a lot. For example the current Go S2 implementation is still using Rect bounding boxes around loops, which is why regionagogo uses a data file, so it can be generated from the C++ version.

Future

This technique seems to work well for stabbing queries, region queries, geofencing …
It can be a solid foundation to create a flexible and simple geo database.

26 Jan 2016, 14:01

A fast geo database with Google S2 take #2

Six months ago, I wrote on this blog about Geohashes and LevelDB with Go, to create a fast geo database.
This post is very similar as it works the same way, but replaces GeoHashes with the Google S2 library for better performance.

There is an S2 Golang implementation maintained by Google, not as complete as the C++ one but close.

For the storage this post stays agnostic to avoid any trolling, but it applies to any key value store: LevelDB/RocksDB, LMDB, Redis…
I personally use BoltDB and gtreap for my experiments.

This post focuses on Go usage but can be applied to any language.

Or skip to the images below for visual explanations.

Why not Geohash?

Geohash is a great solution to perform geo coordinates queries but the way it works can sometimes be an issue with your data.

  • Remember geohashes are cells of 12 different widths, from 5,000km down to 3.7cm. When you perform a lookup around a position that is close to a cell’s edge, you could miss some points from the adjacent cell. That’s why you have to query the 8 neighbour cells as well, which means 9 range queries into your DB to find all the closest points to your location.

  • If your lookup does not fit in level 4 (39km by 19.5km), the next level is 156km by 156km!

  • The query is not performed around your coordinates: you search for the cell you are in, then query the adjacent cells at the same level/radius. Depending on your needs this is very approximate, and you can only perform ‘circle’ lookups around the cell you are in.

  • The most precise geohash needs 12 bytes of storage.

  • Cells on either side of the -90/+90, +180/-180 and -0/+0 boundaries do not have neighbouring prefixes.

Why S2?

S2 cells have a level ranging from 30 (~0.7cm²) to 0 (~85,000,000km²).
S2 cells are encoded as a uint64, which is easy to store.

The main advantage is the region coverer algorithm: give it a region and the maximum number of cells you want, and S2 will return cells at different levels that cover the region you asked for. Remember that one cell corresponds to one range lookup you’ll have to perform in your database.

The coverage is more accurate, which means fewer reads from the DB and fewer objects to unmarshal.

Real world study

We want to query for objects inside Paris city limits using a rectangle:

h5
Using level 5 we can’t fit the left part of the city.
We could add 3 cells on the left (12 DB queries in total), but most algorithms will zoom out to level 4.

h4
But now we are querying for the whole region.

s2
Using s2 asking for 9 cells using a rectangle around the city limits.

s2 vs h4
The zones queried by Geohash in pink and S2 in green.

Example S2 storage

Let’s say we want to store every city in the world and perform a lookup to find the closest cities around a point. First we need to compute the CellID for each city.

// Compute the CellID for lat, lng
c := s2.CellIDFromLatLng(s2.LatLngFromDegrees(lat, lng))

// store the uint64 value of c to its bigendian binary form
key := make([]byte, 8)
binary.BigEndian.PutUint64(key, uint64(c))

Big endian is needed to order bytes lexicographically, so we can seek later from one cell to the next closest cell on the Hilbert curve.
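The ordering argument can be checked with the standard library alone. This small sketch encodes two numbers both ways and compares the resulting keys byte by byte, as a key value store would; the helper names are made up for the example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// bigEndianKey encodes v as 8 bytes, most significant byte first.
func bigEndianKey(v uint64) []byte {
	k := make([]byte, 8)
	binary.BigEndian.PutUint64(k, v)
	return k
}

// littleEndianKey encodes v as 8 bytes, least significant byte first.
func littleEndianKey(v uint64) []byte {
	k := make([]byte, 8)
	binary.LittleEndian.PutUint64(k, v)
	return k
}

// compareKeys orders keys the way a byte-ordered store does.
func compareKeys(a, b []byte) int { return bytes.Compare(a, b) }

func main() {
	// 1 < 256 numerically: big endian keeps that order, little endian flips it.
	fmt.Println(compareKeys(bigEndianKey(1), bigEndianKey(256)))       // -1
	fmt.Println(compareKeys(littleEndianKey(1), littleEndianKey(256))) // 1
}
```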

c is a CellID at level 30.

Now we can store key as the key and a value (a string or msgpack/protobuf) for our city, in the database.

Example S2 lookup

For the lookup we use the opposite procedure, first looking for one CellID.

// citiesInCellID looks for cities inside c
func citiesInCellID(c s2.CellID) {
  // compute min & max limits for c
  bmin := make([]byte, 8)
  bmax := make([]byte, 8)
  binary.BigEndian.PutUint64(bmin, uint64(c.RangeMin()))
  binary.BigEndian.PutUint64(bmax, uint64(c.RangeMax()))

  // perform a range lookup in the DB from bmin key to bmax key, cur is our DB cursor
  for k, v := cur.Seek(bmin); k != nil && bytes.Compare(k, bmax) <= 0; k, v = cur.Next() {
    // decode the big-endian key back into a CellID
    cell := s2.CellID(binary.BigEndian.Uint64(k))

    // Read back a city
    ll := cell.LatLng()
    lat := ll.Lat.Degrees()
    lng := ll.Lng.Degrees()
    name := string(v)
    fmt.Println(lat, lng, name)
  }
}

Then compute the CellIDs for the region we want to cover.

rect := s2.RectFromLatLng(s2.LatLngFromDegrees(48.99, 1.852))
rect = rect.AddPoint(s2.LatLngFromDegrees(48.68, 2.75))

rc := &s2.RegionCoverer{MaxLevel: 20, MaxCells: 8}
r := s2.Region(rect.CapBound())
covering := rc.Covering(r)

for _, c := range covering {
    citiesInCellID(c)
}

RegionCoverer will return at most 8 cells (in this case 7 cells: four at level 8, one at level 7, one at level 9 and one at level 10) that are guaranteed to cover the given region. The covering can extend beyond the region, so we may have to exclude cities that are NOT in our rect, using func (Rect) ContainsLatLng(LatLng).

Congrats we have a working geo db.

S2 can do more with complex shapes like Polygons and includes a lot of tools to compute distances, areas, intersections between shapes…

Here is a Github repo with the data & scripts used to generate the images.

30 Aug 2015, 18:25

A blazing fast geo database with LevelDB, Go and Geohashes

You have probably heard of LevelDB: it’s a blazing fast key value store (a library, not a daemon) that uses Snappy compression.
There are plenty of uses for it, and the API is very simple, at least in Go (I will be using Goleveldb).

The key is a []byte and the value is a []byte, so you can “get”, “put” & “delete”, that’s it.

I needed a low memory, low CPU system that could collect millions of geo data points and query over them. Geohash has an interesting property: you can encode longitude and latitude into a string. The hash f2m616nn represents the lat & long 46.770, -71.304; if you shorten the string to f2m61 it still refers to the same lat & long but with less precision.
A 4 character hash has a precision of 19,545 meters; to perform a lookup around a position you simply query the 8 adjacent blocks. There is a Geohash library for Go.
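The prefix property is easy to see with a from-scratch encoder. This is a minimal sketch of the usual geohash scheme (bits alternate between longitude and latitude, five bits per base32 character), not the Go library mentioned above; the encode function is written only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// base32 is the geohash alphabet (no a, i, l or o).
const base32 = "0123456789bcdefghjkmnpqrstuvwxyz"

// encode returns the geohash of (lat, lng) with the given number of characters.
func encode(lat, lng float64, precision int) string {
	latMin, latMax := -90.0, 90.0
	lngMin, lngMax := -180.0, 180.0
	var sb strings.Builder
	bits, n := 0, 0
	useLng := true // bits alternate, starting with longitude
	for sb.Len() < precision {
		if useLng {
			mid := (lngMin + lngMax) / 2
			if lng >= mid {
				bits = bits<<1 | 1
				lngMin = mid
			} else {
				bits <<= 1
				lngMax = mid
			}
		} else {
			mid := (latMin + latMax) / 2
			if lat >= mid {
				bits = bits<<1 | 1
				latMin = mid
			} else {
				bits <<= 1
				latMax = mid
			}
		}
		useLng = !useLng
		n++
		if n == 5 { // every 5 bits become one base32 character
			sb.WriteByte(base32[bits])
			bits, n = 0, 0
		}
	}
	return sb.String()
}

func main() {
	long := encode(46.770, -71.304, 8)
	short := encode(46.770, -71.304, 5)
	// The shorter hash is always a prefix of the longer one.
	fmt.Println(long, short, strings.HasPrefix(long, short))
}
```

Because each extra character only refines the same cell, truncating a hash gives the enclosing cell, which is what makes prefix scans in a byte-ordered store work.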

Here you would store all of your data points matching a geohash in the same set.
The problem is that there is no such thing as a set in LevelDB.

But there is a cursor, so you can seek to a position and then iterate over the next or previous keys (byte ordered).
So your data could be stored this way: 4 character geohash + a unique id.

Then you can perform a proximity lookup by searching for the 8 hashes adjacent to the position you are looking at, with a precision of 20km. Good, but not very flexible.

We can have a more generic solution. First we need a key: a simple int64 unique id.

// NewKey generates a new key using time prefixed by 'K'
func NewKey() Key {
	return NewKeyWithInt(time.Now().UnixNano())
}

// NewKeyWithInt returns a key prefixed by 'K' with value i
func NewKeyWithInt(id int64) Key {
	key := bytes.NewBufferString("K")
	binary.Write(key, binary.BigEndian, id)
	return key.Bytes()
}

Here we encode the key with a Unix timestamp, so our key is not just a key, it’s also an encoded time value, and it will be unique thanks to the nanosecond precision. We are using BigEndian so keys can be byte compared: older keys sort before newer ones.

Now for the geo encoding, our key will be of the form:
G201508282105dhv766K followed by the binary encoded id.
You always need a prefix for your keys so you can seek and browse them without running over different key types. Here I have a G for Geo, then a string encoded date prefix so we can search by date. We don’t want extra precision here, as it would add extra seeks to LevelDB (that’s why the minutes are rounded down to a multiple of 10). Then we add a precise geohash and finally our previous unique id.

// NewGeoKey generates a new key using a position & a key
func NewGeoKey(latitude, longitude float64) GeoKey {
	t := time.Now().UTC()
	kint := t.UnixNano()
	kid := NewKeyWithInt(kint)
	// G + string date + geohash 6 + timestamped key 
	// G201508282105dhv766K....
	gk := geohash.EncodeWithPrecision(latitude, longitude, 6)
	ts := t.Format("2006010215")

	// modulo 10 to store 10mn interval
	m := t.Minute() - t.Minute()%10
	zkey := []byte("G" + ts + fmt.Sprintf("%02d", m) + gk)
	zkey = append(zkey, kid...)
	return zkey
}
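To show how such keys are found again, here is a sketch that replaces LevelDB with a sorted in-memory slice. bucketPrefix mirrors the date part of the key above; scanPrefix and demo are hypothetical helpers, and the binary id suffix is left out to keep the keys readable.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
	"time"
)

// bucketPrefix builds "G" + yyyymmddhh + minutes rounded down to a multiple of 10.
func bucketPrefix(t time.Time) string {
	t = t.UTC()
	return "G" + t.Format("2006010215") + fmt.Sprintf("%02d", t.Minute()-t.Minute()%10)
}

// scanPrefix mimics a cursor: seek to the first key >= prefix, then iterate
// while keys still start with the prefix. sorted must be in byte order.
func scanPrefix(sorted [][]byte, prefix []byte) [][]byte {
	i := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i], prefix) >= 0
	})
	var out [][]byte
	for ; i < len(sorted) && bytes.HasPrefix(sorted[i], prefix); i++ {
		out = append(out, sorted[i])
	}
	return out
}

// demo stores four keys and looks up one time bucket and one geohash prefix.
func demo() (string, int) {
	t := time.Date(2015, 8, 28, 21, 7, 0, 0, time.UTC)
	keys := [][]byte{
		[]byte(bucketPrefix(t) + "dhv766"),
		[]byte(bucketPrefix(t) + "dhv7bc"),
		[]byte(bucketPrefix(t) + "f2m616"),
		[]byte(bucketPrefix(t.Add(10*time.Minute)) + "dhv766"),
	}
	sort.Slice(keys, func(i, j int) bool { return bytes.Compare(keys[i], keys[j]) < 0 })
	prefix := bucketPrefix(t) + "dhv7"
	return prefix, len(scanPrefix(keys, []byte(prefix)))
}

func main() {
	prefix, n := demo()
	fmt.Println(prefix, n)
}
```

Only the keys in the same 10 minute bucket and the same 4 character geohash cell are visited; the other two keys are never read.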

We can now look up by flexible date & by flexible proximity, like a Redis ZRANGE; you simply need to reverse the process.

// GeoKeyPrefix returns the prefixes to look up for a time range
func GeoKeyPrefix(start, stop time.Time) []string {
	var res []string
	d := 10 * time.Minute
	// start from the 10 minute bucket containing start, so the bucket
	// containing stop is not skipped when start is not aligned
	for t := start.Truncate(d); !t.After(stop); t = t.Add(d) {
		key := "G" + t.Format("2006010215") + fmt.Sprintf("%02d", t.Minute()-t.Minute()%10)
		res = append(res, key)
	}
	return res
}

Look up this way:

	d := time.Duration(-10) * time.Minute
	geoPrefixs := GeoKeyPrefix(time.Now().UTC().Add(d), time.Now().UTC())

	// find adjacent hashes in m
	// 1, 5003530
	// 2, 625441
	// 3, 123264
	// 4, 19545
	// 5, 3803
	// 6, 610
	gk := geohash.EncodeWithPrecision(lat, long, 4)
	adjs := geohash.CalculateAllAdjacent(gk)
	adjs = append(adjs, gk)

	// for each adjacent blocks
	for _, gkl := range adjs {

		// for each time range modulo 10
		for _, geoPrefix := range geoPrefixs {
			startGeoKey := []byte(geoPrefix + gkl)
			iter := s.NewIterator(util.BytesPrefix(startGeoKey), nil)

			for iter.Next() {
				log.Println(iter.Value())
			}
			iter.Release()
		}
	}

It could be optimized by reducing the size of the keys, but it performs extremely well: it stores around 3 million geopoints per day using less than 3% CPU and can serve hundreds of queries per second.

Oh did I forget to mention it’s running on a Raspberry Pi? :)

I could maybe turn it into a library but it’s so simple it’s probably useless.
Next blog post: what are those millions points used for?

02 Aug 2015, 13:10

Access OS metrics from Golang

I’ve recently published StatGo, which gives you access to your operating system metrics like free memory, used disk space …

It’s a binding to the C library libstatgrab, a proven stable piece of code that works on many different systems, FreeBSD, Linux, OSX …

It’s very simple to use:

s := NewStat()
c := s.CPUStats()
fmt.Println(c.Idle)
// 98.2

Feel free to contribute. It may need some improvement but it’s working: I’m using it in a small metrics web server to monitor a small network of servers.

The code is on Github

02 Apr 2015, 12:49

A 10 minutes walk into Grafana & Influxdb

This is a 10 minute tutorial to set up InfluxDB + Grafana with Go on your Mac, but it should work with minor modifications on your favorite Unix too. It assumes you already have a working Go compiler.

InfluxDB is a database specialized in time series: think of storing everything associated with a time, which makes it perfect for monitoring and graphing values. Grafana is a JS frontend capable of reading the data from InfluxDB and graphing it.

brew install influxdb

Start InfluxDB:

influxdb -config /usr/local/etc/influxdb.conf

Then point your browser to http://localhost:8083; the default user is root, the password is root and the default port is 8086. Create a database called test.

Let’s test the connection between the db and Go. First install the InfluxDB driver for Go:

go get github.com/influxdb/influxdb/client

Test your setup with some code:

package main

import (
    "fmt"

    "github.com/influxdb/influxdb/client"
)

func main() {
    c, err := client.NewClient(&client.ClientConfig{
        Username: "root",
        Password: "root",
        Database: "test",
    })

    if err != nil {
        panic(err)
    }

    dbs, err := c.GetDatabaseList()
    if err != nil {
        panic(err)
    }

    fmt.Println(dbs)
}

If all is well you should see a map containing all your created InfluxDB databases.

Now let’s measure something real: the time it takes for your http handler to answer.

package main

import (
    "fmt"
    "log"
    "math/rand"
    "net/http"
    "time"

    "github.com/influxdb/influxdb/client"
)

var c *client.Client

func mySuperFastHandler(rw http.ResponseWriter, r *http.Request) {
    start := time.Now()
    // sleep for a random time to simulate some work
    rand.Seed(time.Now().Unix())
    i := rand.Intn(1000)
    time.Sleep(time.Duration(i) * time.Millisecond)
    fmt.Fprintf(rw, "Waiting %dms", i)
    t := time.Since(start)

    // sending the serie
    s := &client.Series{
        Name:    "myhostname.nethttp.mySuperFastHandler.resp_time",
        Columns: []string{"duration", "code", "url", "method"},
        Points: [][]interface{}{
            []interface{}{int64(t / time.Millisecond), 200, r.RequestURI, r.Method},
        },
    }
    err := c.WriteSeries([]*client.Series{s})
    if err != nil {
        log.Println(err)
    }
}

func main() {
    var err error
    c, err = client.NewClient(&client.ClientConfig{
        Username: "root",
        Password: "root",
        Database: "test",
    })
    if err != nil {
        panic(err)
    }

    http.HandleFunc("/", mySuperFastHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

This is not very useful as it only measures the time to write to the ResponseWriter, which is why I’ve added some random sleep, but you get the idea. It saves one series point per request with: duration, status code, url, http method. The name of the series is important, as many tools (such as Graphite) use the dots as separators, so think twice before naming your series. Point your browser to http://localhost:8080 and reload the page several times.

Now that we have data, let’s browse it with the InfluxDB browser: go to the InfluxDB admin, hit “explore data” and select with:

SELECT duration FROM myhostname.nethttp.mySuperFastHandler.resp_time WHERE code = 200;


You should be able to see the inserted data points.

Now let’s work with Grafana: download the tar gz, uncompress it somewhere, and copy this demo config.js file into the root directory of Grafana. Go to the InfluxDB admin with your browser and add a new database called “grafana”.

In your web browser, open the file index.html in the Grafana directory; you should see the Grafana interface. Edit the default graph and enter the query as follows:

  • Click on series; it will complete with myhostname.nethttp.mySuperFastHandler.resp_time
  • In alias type $0 $2; it will use the 1st and the 3rd parts of the name (remember the dots), so it will display myhostname mySuperFastHandler
  • Finally click on mean and choose duration in the completion, then add code = 200 as the where clause.

Hit save and you are done!


There is so much more you can do with InfluxDB & Grafana. It’s really simple to collect and display data, and I hope you want to go further after this. You can look at my generic net/http handler for InfluxDB on Github, which can be integrated into your code.
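The idea of a generic handler can be sketched as a small middleware using only the standard library. This is not the code from the Github repository: measure, statusRecorder and the record callback are hypothetical names, and the callback is where a write to InfluxDB would go.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// statusRecorder remembers the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	code int
}

func (s *statusRecorder) WriteHeader(code int) {
	s.code = code
	s.ResponseWriter.WriteHeader(code)
}

// measure wraps next and reports duration, status code, URL and method to
// record once the request has been served.
func measure(next http.Handler, record func(d time.Duration, code int, url, method string)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		rec := &statusRecorder{ResponseWriter: w, code: http.StatusOK}
		next.ServeHTTP(rec, r)
		record(time.Since(start), rec.code, r.RequestURI, r.Method)
	})
}

// demo serves one fake request and returns what the callback received.
func demo() (int, string) {
	var gotCode int
	var gotMethod string
	h := measure(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusTeapot)
	}), func(d time.Duration, code int, url, method string) {
		gotCode, gotMethod = code, method
	})
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("GET", "/hello", nil))
	return gotCode, gotMethod
}

func main() {
	code, method := demo()
	fmt.Println(code, method)
}
```

Wrapping the handler keeps the measurement out of the business code, and the same four values as in the series above (duration, code, url, method) are available in one place.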