An investigative walk-through of Go's channels

This is a summary of a talk I gave during Gophercon Vietnam & Gophercon India. If you have seen this talk already, you might find the concepts crystallized after reading the blog post. You can find the slides here.

I have split this blog post, similar to the way I had structured the talk:

  • A simple scraper walk-through
  • Deep dive into Go's channels
  • Runtime execution and demo

#Simple scraper walk-through

In this walk-through, I will be talking about a simple program to scrape from Amazon's product listing page. All the code is available here.

Product Listing

Here, we are scraping data into a struct which looks like:

type Product struct {
	Name    string
	Link    string
	Image   string
	Price   string
	Reviews []Review
}

In this program, I am using a simple scraping & parsing library called soup.

The algorithm can be summarized as:

As a result, the snippet of code below,

Main Code

produces the following output:

{
  "Name": "Sanyo 108.2 cm (43 inches) Full HD IPS LED TV XT-43S7100F (Black)",
  "Link": "https://www.amazon.in/Sanyo-108-2-inches-XT-43S7100F-Black/dp/B01ICVLK4S/ref=lp_1389396031_1_5/260-5276449-2035623?s=electronics&ie=UTF8&qid=1540385124&sr=1-5",
  "Image": "https://images-eu.ssl-images-amazon.com/images/I/51F0yKjVX1L._AC_US218_.jpg",
  "Price": "17,990",
  "Reviews": [
    {
      "Name": "Cooper Vrf",
      "Rating": "5.0 out of 5 stars",
      "Content": "Much has been said about the clarity and performance of the TV. Yes its a great value for money product, but here I am going to say about the customer service part. After using my 49 inch TV for about 21 months one day my TV went black. I could hear only the sound but no picture. I emailed the Sanyo customer care and within four hours a service person called me and fixed the appointment for very next day. The next day he inspected the TV and told that the panel has gone dead and they will have to change the panel. As my TV was under 2 year warranty he assured me that the company will possibly replace the panel and it could be done within 10 days. I was very much skeptical about the panel replacement as the next day I received an email from the company saying that they will be taking all possible actions to solve my query but no mention of panel replacement. After about a week I received a call from their service center in Bilimora that my tv has arrived and the service person is coming to install it. Even though it was raining heavily on that day he came right on time and to my amazement he brought out from his car a whole new TV box. Believe me the company has replaced a whole new tv in place of the old one, only the two boards from my old tv were fixed on the new one. Really hats off to the Company's customer service and special mention to their service centre in Bilimora which got the job done on time even in heavy rainfall and even though my hometown is about 25 kms away. Within the next two hours I received three calls from the company regarding my query solved. Thanks Sanyo..... Really satisfied on buying your company's TV."
    },
    {
      "Name": "Debasish Sen",
      "Rating": "4.0 out of 5 stars",
      "Content": "Great clarity. Decent sound. A cracking deal at this price."
    },
    {
      "Name": "Naziruddin",
      "Rating": "4.0 out of 5 stars",
      "Content": "A good tv for the exchange price.Good reception n colours.Gurantee card does not say 10 years tv panel warranty but mentioned on the web site of the Tv,hopefully it's  true. Worth buying."
    }
  ]
}

The above code has room for out-of-order execution, this means we could use some of Go's concurrency paradigms like channels and goroutines to make the code run faster. This is what the code looks like now:

Concurrent scraper

#Deep dive into Go's channels

Go has two channel implementations:

  • buffered ch := make(chan Product, 3)
  • unbuffered ch := make(chan Product)

They both make use of the same hchan struct under the hood.

#What's in a hchan struct?

Hchan struct

The information in a hchan struct can be broadly grouped as:

  • Type information
  • Buffers & queues
  • Counters
  • Locks & flags

Let's look at each of them in detail:

#Type information

  • elemsize

    Represents the size of a single element in memory in bytes. For eg, our Product struct takes 88 bytes (string takes 16 bytes, Review takes 24 bytes) in the stack. Type int takes 8 bytes.

  • elemtype

    Used when messages are copied over from one goroutine to the other. It has a bunch of fields which provide type and size information for the type of values the channel can hold. There is a bit of overlap with elemsize.

    Elem Type

#Buffers & queues

#Buffers

  • buf

    This is a circular queue. With a buffered channel of size 3, ch := make(chan Product, 3), buf will look as follows:

    buffer Illustration

    Essentially make(chan Product, 1) != make(chan Product), because when channel size is defined a circular queue implementation is used internally to hold the elements of a channel.

#Queues

  • recvq & sendq

    Before we deep dive into these two fields, let's talk about sudog struct:

    sudog struct

    sudog represents a goroutine. recvq & sendq hold the information on the goroutines which are currently blocking on the channel while receiving and sending respectively. They are pointers to sudog.

    Taking recvq as an example:

    recvq Structure

    Image credits: Ankur Anand

    sudog is designed as a doubly linked list. It has pointers to the prev and next goroutines which are blocking on the channel. It also contains an unsafe pointer to elem, the element which the goroutine is currently blocking on.

    As and when the channel is ready, the next goroutine to run is picked from the recvq and sendq by the Go's scheduler.

#Counters

  • dataqsiz

    This represents the entire buffer size of the channel, as specified by the user. Essentially, the queue capacity.

  • qcount

    This represents the number of slots in the circular queue currently filled up.

  • sendx & recvx

    These are indices on the circular queue, buf to figure out where to next dequeue from (recvx) or enqueue to (sendx).

    Let's look at an alternative representation of the buffered channel and let's go through a few scenarios to understand the behavior of sendx and recvx.

Kavya's Buffered Queue

Image credits: Kavya Joshi

When the channels is empty: Kavya's Buffered Queue

Image credits: Kavya Joshi

After an enqueue happens: Kavya's Buffered Queue

Image credits: Kavya Joshi

After the channel becomes full, note the dataqsiz & qcount become equal. All sends to this channel will block at this point. Kavya's Buffered Queue

Image credits: Kavya Joshi

Let's dequeue one element: Kavya's Buffered Queue

Image credits: Kavya Joshi

#Flags & locks

  • closed

    This is a flag which shows whether the channel is currently closed or not. A channel can be closed by any of the goroutines which have access to the channel.

  • lock

    This is a mutex which controls access to the channels. Channels are goroutine safe due to the use of this lock. It controls receives and sends to the channel, ensuring no race condition happens.

This was all you need to know to under the following demo.

#Runtime execution & demo

Please watch the following video to catch channels live in action.

#Summarizing channels

  • FIFO (first-in, first-out) - recvq/sendq

  • Do not communicate by sharing memory; instead, share memory by communicating.1

  • Thread Goroutine safe

  • First-class citizens!

#References




Latest Posts