Communicating With Apple Watch

I was listening to the latest episode of The Talk Show with guest Joanna Stern about the Apple Watch. During one section they started talking about tap as a means of communication, and the impact this might or might not have. As John wrote in his review of the watch, it's not that hard to imagine at least a few scenarios where, for example, sharing a heartbeat would be novel, intimate, and might even gain widespread use.

Having thought about this more, I do wonder if perhaps the ability to effectively touch someone from afar will turn out to be a big deal. I think it will.

You can imagine that a native SDK app might even be able to take input from one person and send it to another, enabling a Morse-code style of communication, for example.
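Just as a sketch of the idea (hypothetical code, assuming a watch-side haptics API along the lines of WKInterfaceDevice's play(_:), and ignoring entirely how the taps would get from one person's watch to another's), tapping out a message in Morse might look something like this:

{% highlight swift %}
import Foundation
import WatchKit

// Hypothetical sketch: turn a short string into Morse-style taps.
// The table and timings are illustrative, not a real protocol.
let morse: [Character: String] = ["s": "...", "o": "---", "e": ".", "t": "-"]

func tapOut(_ message: String) {
    let device = WKInterfaceDevice.current()
    var delay: TimeInterval = 0

    for character in message.lowercased() {
        guard let code = morse[character] else { continue }
        for symbol in code {
            DispatchQueue.main.asyncAfter(deadline: .now() + delay) {
                // A light tap for a dot, a stronger one for a dash.
                device.play(symbol == "." ? .click : .start)
            }
            delay += (symbol == "." ? 0.3 : 0.6)
        }
        delay += 0.6 // gap between letters
    }
}
{% endhighlight %}

Whether Apple would ever let third-party apps drive the Taptic Engine like this is an open question, of course.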

There's a lot more to consider here, but it seems like it's a bigger deal than I thought it might be at first.

It also makes me wonder just how much more intrusive it will feel to be tapped by some spammy notification in an app.

On Apple Watch

Up to this point I haven't been sold (personally) on Apple Watch. The main drawbacks as I saw them were:

  1. Price. And it's a recurring price since you know you'll have to upgrade every year.
  2. Size. I have small wrists and I don't like large watches. I don't even always wear a watch. I don't want to wear a huge piece of jewelry on my wrist.
  3. Battery. I don't want to charge something every night, especially when it would otherwise have utility (sleep tracking).
  4. Utility. What in the world is the Apple Watch (or any smart watch) going to do for me that I care about?

However, having listened to quite a few podcasts on the topic and read even more posts, I'm convinced now that (at least at some point) I'm going to want one.

  1. I'll get over the price. And if the main SDK components remain the same then there isn't that much computation being done on the watch itself. The year-over-year upgrade may not turn out to be so compelling. And if your band can last more than 2 years, you'd only have to upgrade the watch, not the band.
  2. I'll get over the size. Everyone will have one. It won't be so weird.
  3. I'll get over the battery. Because there will be so much utility. Which leads me to...
  4. Utility. I'm now convinced of enough positive use cases that I think it would really be a helpful device.

A few use cases for Apple Watch:

  • When there's motion on my front porch, my wrist gets a tap and I can see a picture from my porch camera to see who or what is there. Same with other household security notifications.

  • When I get home I can open the garage door via the app I wrote to control my garage door remotely. Why would you want to do this? Because your watch knows that you are you, and thieves like to steal garage door openers and use them to get into your stuff. It'd be safer to not even carry one.

  • When I'm driving and get a text message, I can glance at my wrist, see whether it's something I care about (or not), and respond via Siri, without having to find my phone and get it out.

  • My wife can find her phone in the house when she loses it.

  • Easier interface to Siri.

Since my watch can know that it's me, and therefore verify that I am, in fact, me, it can act as a presence notifier on my behalf. This leads to some pretty great possibilities:

  • My car can unlock the doors as I walk up, and let me start the car without another key. And I don't need a massive fob in my pocket to let me do this.

  • Same with my front door (though I'm unsure I'd ever opt for a lock like this).

  • Turning off the lights when we leave the house, if we've accidentally left them on.

  • Allowing me to verify myself as other services support such features. Things like Apple Pay are already there, but other sorts of check-in, registration, and verification could all be linked as well.

I'm not planning on getting one immediately, but I think I am far more likely to purchase one than I was a few months ago.

The History of English Podcast

I was recently made aware of the great History of English Podcast.

The author, Kevin Stroud, while not a professional linguist, is a wonderful storyteller. Each episode covers a combination of the etymologies of English words and the history of the people who have spoken the language in its various forms, from its roots thousands of years ago.

I'm still playing catch-up, but even less than ten episodes in I'm finding that it has firmly drawn my attention to words, explaining how various words came into English, and why even words with similar meanings sound so different from one another, e.g. why we have both horse and equine.

If you have any interest in history, and even a passing curiosity about the English language, I highly recommend checking it out.

Handling Motion JPEG Streams on iOS

I have several Foscam cameras around the outside of my house. They're very easy to set up, tolerate the outdoor conditions admirably, and are incredibly affordable for what they offer.

As with everything else around my house, I like to build software that customizes my view into my home (or in this case, outside my home). To that end I've built an app I call Argos that lets me monitor all sorts of sensors on my property.

Once I installed the first set of cameras I wanted to implement some views that would display the current video stream from each camera. After looking into the documentation I discovered that the cameras I have offer two types of video streams: Windows streaming video (ASF) and Motion JPEG.

I don't have a lot of experience writing software to handle video streams. But from the basic description it seemed that a Motion JPEG stream is just an HTTP stream that continually pushes out a series of JPEG images.

Oh, well that's easy. Right?

Well, not so fast.

What Is Motion JPEG?

It turns out that there is no true Motion JPEG standard. There are, however, two typical implementations, Motion JPEG-A and Motion JPEG-B. Motion JPEG-A supports the concept of markers, while Motion JPEG-B does not. This difference is important, but for the rest of this discussion all we need to know is that the Foscam camera stream is Motion JPEG-A.

A Motion JPEG-A stream looks (to me) a lot like a multipart email message. There are several sections, each separated by a boundary string declared in the response headers. Within each section is some binary data (encoded or not) that represents the object in that section. In our case, each section is a JPEG image.

We can see what this looks like by using the curl command:

{% highlight text %}
› curl -D - "http://192.168.300.301/videostream.cgi?user=admin&pwd=SECRETS"
HTTP/1.1 200 OK
Server: Netwave IP Camera
Date: Wed, 02 Jul 2014 22:28:03 GMT
Accept-Ranges: bytes
Connection: close
Content-Type: multipart/x-mixed-replace;boundary=ipcamera

--ipcamera
Content-Type: image/jpeg
Content-Length: 43996

????JFIF???!???

[a lot of binary data]

--ipcamera
Content-Type: image/jpeg
Content-Length: 44176

????JFIF???!???

[a lot more binary data]
{% endhighlight %}

It just goes on and on like this.

We can see in the header of the response that the boundary text will be ipcamera, and the two lines following each boundary include the content type and the content length.

So How Do We Parse This?

This is the basic approach to parsing a data stream like this:

  1. Read in the first chunk of data
  2. Does the chunk contain a boundary marker?
  3. If so, is that boundary marker the first boundary marker?
  4. If it is the first one, then skip it.
  5. Is there another marker? If so, then we have a complete image.
  6. If we have a complete image, find the start and end of the image, remove those from our buffer, and process the image.
  7. If we do not yet have a complete image, append the data to the buffer, and wait for the next chunk of data.

The key here is that we never know how many chunks it will take to make one image. In an ideal world we'd just get one chunk per image and we could throw that right into an NSData object and convert it to a UIImage.

Here's the code I have so far for parsing the Motion JPEG stream:

The heart of the code is in func URLSession(session: NSURLSession, dataTask: NSURLSessionDataTask, didReceiveData data: NSData). That's where we attempt to see if we've hit the end of an image and, if so, extract it from the buffer.
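Roughly sketched, that approach might look like the following (this is not the actual Argos code: the names are hypothetical, the boundary is hard-coded to the --ipcamera value we saw above, and it uses the newer URLSession and Data spellings):

{% highlight swift %}
import Foundation
import UIKit

// Sketch only: buffer incoming chunks, and whenever the buffer contains two
// boundary markers, pull out the complete JPEG that sits between them.
final class MJPEGStreamParser: NSObject, URLSessionDataDelegate {
    private var buffer = Data()
    private let boundary = Data("--ipcamera".utf8) // from the Content-Type header

    var onImage: ((UIImage) -> Void)?

    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data) {
        buffer.append(data)
        extractImages()
    }

    private func extractImages() {
        // Keep extracting as long as the buffer holds two boundary markers.
        while let first = buffer.range(of: boundary),
              let second = buffer.range(of: boundary, in: first.upperBound..<buffer.endIndex) {
            let section = buffer.subdata(in: first.upperBound..<second.lowerBound)

            // Skip the section's own headers: the JPEG itself starts at the
            // SOI marker (0xFF 0xD8).
            if let start = section.range(of: Data([0xFF, 0xD8])),
               let image = UIImage(data: section.subdata(in: start.lowerBound..<section.endIndex)) {
                onImage?(image)
            }

            // Drop everything up to the second boundary and look again.
            buffer.removeSubrange(buffer.startIndex..<second.lowerBound)
        }
    }
}
{% endhighlight %}

From there it's just a matter of creating a URLSession with this object as its delegate and starting a data task against the camera's videostream.cgi URL.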

Bugs

So far the code works fairly well, except that from time to time when I attempt to make a UIImage out of the extracted data I get a failure. I'm not sure if the data coming out of my camera is bad (unlikely) or if I'm messing up the extraction (much more likely).

Improvements

What I'm currently not doing, but probably should be, is using the Content-Length header to verify the length of the image data before passing it off. I suspect that would be a far more reliable way to extract the data from the buffer.
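A check along those lines might look something like this (a hypothetical helper, assuming the section's header text has already been split out into a string):

{% highlight swift %}
import Foundation

// Sketch: pull the declared Content-Length out of a section's headers so the
// extracted JPEG data can be compared against it before building a UIImage.
func declaredContentLength(in headers: String) -> Int? {
    for line in headers.components(separatedBy: "\r\n") {
        let parts = line.split(separator: ":", maxSplits: 1, omittingEmptySubsequences: true)
        guard parts.count == 2 else { continue }
        if parts[0].trimmingCharacters(in: .whitespaces).lowercased() == "content-length" {
            return Int(parts[1].trimmingCharacters(in: .whitespaces))
        }
    }
    return nil
}
{% endhighlight %}

If the extracted JPEG's byte count doesn't match the declared length, the frame could simply be dropped rather than handed to UIImage.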

Future

Beyond cleaning up the code a bit and trying to make it more reliable, I would love for this view to include some other nice features down the road, like gesture recognizers that let me pan and tilt the camera. Several apps dedicated to IP camera viewing do this, and it shouldn't be too difficult to get right.
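Sketched very loosely, the gesture side might look like this (sendPanTiltCommand is a hypothetical stub; the actual CGI commands for pan and tilt would come from the camera's documentation):

{% highlight swift %}
import UIKit

// Sketch: map swipe gestures on the camera view to pan/tilt commands.
enum PanTiltDirection { case up, down, left, right }

final class CameraViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        for direction: UISwipeGestureRecognizer.Direction in [.up, .down, .left, .right] {
            let swipe = UISwipeGestureRecognizer(target: self, action: #selector(handleSwipe(_:)))
            swipe.direction = direction
            view.addGestureRecognizer(swipe)
        }
    }

    @objc private func handleSwipe(_ gesture: UISwipeGestureRecognizer) {
        switch gesture.direction {
        case .up:    sendPanTiltCommand(.up)
        case .down:  sendPanTiltCommand(.down)
        case .left:  sendPanTiltCommand(.left)
        case .right: sendPanTiltCommand(.right)
        default:     break
        }
    }

    private func sendPanTiltCommand(_ direction: PanTiltDirection) {
        // Hypothetical: issue the appropriate CGI request to the camera here.
    }
}
{% endhighlight %}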