08.31.2009 16:12

AIS time and matplotlib

I have been working more on AIS time. In the process, I pushed gnuplot over the edge. I have been successfully using gnuplot since 1996, but it is not the most flexible plotting program. I'm forcing myself to use matplotlib and not slide back into gnuplot. Here is my latest attempt with matplotlib.
#!/usr/bin/env python

from numpy import array
import matplotlib.pyplot as plt
import sqlite3

xmin,xmax = 210000,220000

cx = sqlite3.connect('ais.db3')

pos_ok   = array( [tuple(row) for row in cx.execute('SELECT key,(est_sec) FROM position WHERE est_sec IS NOT NULL;')] ).T
pos_fail = array( [tuple(row) for row in cx.execute('SELECT key,(cg_sec) FROM position WHERE est_sec IS NULL;')] ).T

bs_ok    = array( [tuple(row) for row in cx.execute('SELECT key,.1 FROM bsreport WHERE est_sec IS NOT NULL;')] ).T

pos_sec  = array( [tuple(row) for row in cx.execute('SELECT key,timestamp FROM position WHERE timestamp < 60;')] ).T

fig = plt.figure()
ax1 = fig.add_subplot(2,1,1)
ax1.plot(pos_fail[0],pos_fail[1],'b+', pos_ok[0],pos_ok[1],'ro')

ax2 = ax1.twinx()
ax2.plot(bs_ok[0], bs_ok[1], 'gx')

ax2.set_ylim(0,10) #ymin, ymax  # For plotting on top of the same graph, but with a different axis

ax3 = fig.add_subplot(2,1,2)


I've cheated a bit and used photoshop to add labels to this plot, but it is great to be able to add plots with different y-axes on the same graph/subplot.

Posted by Kurt | Permalink

08.31.2009 15:08

Fink and Mac OSX 10.6

I have my copies of Mac OSX 10.6 on order, but I am bummed by the lack of proper support Apple is giving X11. X11 is the number one application on the Mac that I use. I also saw that PPC support has gone away. I wonder if old PPC applications will still run. Hopefully this will all shake out quickly with good results.

I am looking forward to 64-bit fink, but wonder how many of the packages I use on a daily basis will require serious effort to get back working.

Monipol has a post about migrating your fink setup from an older version to a newer version: Starting anew

and he has got a link to this video about fink...

Posted by Kurt | Permalink

08.31.2009 12:59

NOAA report on the US East Coast Sea Level Anomaly

I won't have time to really dig into this, but I have to wonder how this impacts the definition of primary stations used for hydrographic surveying.

NOAA Report Explains Sea Level Anomaly this Summer along the U.S. Atlantic Coast [noaa.gov]

NOAA Technical Report NOS CO-OPS 051; ELEVATED EAST COAST SEA LEVELS ANOMALY: July - June 2009 by Sweet, Zervas, and Gill (38 pp)

Posted by Kurt | Permalink

08.27.2009 15:30

The Healy Aloftcon is now in Google Oceans

The USCGC Healy ice breaker is now in Google Oceans. This is my second contribution to the content available directly in Google Earth. Google is pulling from my GeoRSS feed of the Aloftcon. First you have to enable the layer.

You then have to zoom into the area where the Healy is currently located and the icon will pop into existance. Right now that is 81.17N 151.196W

Here is my version of the Healy Aloftcon tracks.

Posted by Kurt | Permalink

08.26.2009 23:07

Finding out about your mac

There are lots of ways that you can query a Mac about the hardward and software installed on the machine. The old standby is uname will tell us the hardware (PPC or intel x86) and the kernel version:
uname -a
Darwin CatBoxIII 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386 i386
And on a G5 Power Mac tower
uname -a
Darwin eel.ccom.nh 9.8.0 Darwin Kernel Version 9.8.0: Wed Jul 15 16:57:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_PPC Power Macintosh
ifconfig will tell you the MAC addresses of your ethernet, wifi, and blue network interfaces.

Then come the Mac OSX specific tools...

softwareupdate --list will tell you about what will be updated if you do sudo softwareupdate -i -a
softwareupdate --list

Software Update found the following new or updated software:
   * RemoteDesktopAdmin33-3.3
        Remote Desktop Admin Update (3.3), 57530K [recommended]
   * Safari403Leo-4.0.3
        Safari (4.0.3), 41443K [recommended] [restart]
   * MacOSXUpd10.5.8-10.5.8
        Mac OS X Update (10.5.8), 169217K [recommended] [restart]
And about xcode:
xcodebuild -version
code 3.1.2
Component versions: DevToolsCore-1148.0; DevToolsSupport-1102.0
BuildVersion: 9M2621
About the version of Mac OSX:
ProductName:    Mac OS X
ProductVersion: 10.5.7
BuildVersion:   9J61
And the kitchen sink of harward info ... try this one without the "|head" and be prepaired for many pages of info.
system_profiler | head

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro3,1
      Processor Name: Intel Core 2 Duo
      Processor Speed: 2.4 GHz
      Number Of Processors: 1
      Total Number Of Cores: 2
ioreg shows the "I/O Kit registry" meanining? But it has the key you need for checking your warrenty on the mac:
ioreg -l | grep IOPlatformSerialNumber
    |   "IOPlatformSerialNumber" = "W**********G"
Then put in that number at https://selfsolve.apple.com/GetWarranty.do

For updating hardware inventories for our IT team, I do this. Maybe I should add a few like the warrenty info.
hostname=`uname -n`
ip=`ifconfig en0 | grep inet | awk '{print $2}'`
serial=`system_profiler -detailLevel basic SPHardwareDataType | grep 'Serial Number:' | awk '{print $3}'`
model=`system_profiler -detailLevel basic SPHardwareDataType | grep 'Model Name:' | cut -c 7- `
If you have fink installed, you can do these:
fink -V | head -n2
Package manager version: 0.29.8
Distribution version: selfupdate-cvs Tue Aug 25 22:08:51 2009, 10.5, i386
List installed packages with fink list -i. List out of date packages with fink list -o. You can list files in an installed package:
dpkg -L tar | head
As which fink package owns a file: pkg -S /sw/bin/dvips ptex-base: /sw/bin/dvips Tell me all the processes running on the machine in great detail:
ps -a -w -w -x -O user | head
    1 root       ??  Ss     0:01.03 /sbin/launchd
   15 root       ??  Ss     0:00.98 /usr/libexec/kextd
   16 root       ??  Rs     0:11.31 /usr/sbin/DirectoryService
   17 root       ??  Ss     0:02.07 /usr/sbin/notifyd
   18 root       ??  Ss     0:01.03 /usr/sbin/syslogd
   19 root       ??  Ss     0:21.04 /usr/sbin/configd
   20 daemon     ??  Ss     0:03.82 /usr/sbin/distnoted
   21 _mdnsresponder   ??  Ss     0:01.55 /usr/sbin/mDNSResponder -launchd
   26 root       ??  Ss     0:00.45 /usr/sbin/securityd -i
How hard is the machine working and when was it last rebooted?
23:30  up 3 days,  3:31, 1 user, load averages: 0.36 0.24 0.23
List open ports visible on the machine locally (not blocked by the outward facing firewall) and the versions. I have hidden some of the less interesting lines.
fink install nmap
nmap -A localhost
22/tcp    open  ssh          OpenSSH 5.1 (protocol 2.0)
80/tcp    open  http         Apache httpd 2.2.9 ((Unix) mod_python/3.3.1 Python/2.6.2 mod_wsgi/2.3)
88/tcp    open  kerberos-sec Mac OS X kerberos-sec
548/tcp   open  afp          Apple AFP (name: CatBoxIII; protocol 3.3; Mac OS X 10.5)
631/tcp   open  ipp          CUPS 1.3
5432/tcp  open  postgresql   PostgreSQL DB
6000/tcp  open  X11          (access denied)
8001/tcp  open  ssh          OpenSSH 5.1 (protocol 1.99)  Note: This is actually a tunnel to another machine
8080/tcp  open  http         Apache httpd 2.0.52 ((CentOS))  Note: This is actually a tunnel to another machine
50001/tcp open  unknown
1 service unrecognized despite returning data. If you know the service/version, please submit the following fingerprint 
Running: Apple Mac OS X 10.5.X
OS details: Apple Mac OS X 10.5 - 10.5.6 (Leopard) (Darwin 9.0.0 - 9.6.0)
List all open files by any program:
sudo lsof | head -v 3
loginwind    40 schwehr  cwd      DIR       14,2      1394         2 /
loginwind    40 schwehr  txt      REG       14,2    946848  29312506 /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow
loginwind    40 schwehr  txt      REG       14,2    600128  29320396 /System/Library/PrivateFrameworks/Admin.framework/Versions/A/Admin
What is taking all my cpu?
top -o cpu -s 2
Processes:  110 total, 3 running, 1 stuck, 106 sleeping... 338 threads                            23:35:17
Load Avg:  0.11,  0.15,  0.18    CPU usage:  3.37% user,  6.07% sys, 90.56% idle
SharedLibs: num =    9, resident =   87M code, 2944K data, 5224K linkedit.
MemRegions: num = 14597, resident =  553M +   33M private,  364M shared.
PhysMem:  598M wired, 1413M active,  207M inactive, 2225M used, 1871M free.
VM: 11G + 376M   466472(1) pageins, 0(0) pageouts

12986 firefox-bi   4.0%  7:13.11  14   207    981  112M+   27M-  158M   545M
    0 kernel_tas   3.8% 21:44.84  67     2    526 8048K      0   224M   351M
14113 top          3.7%  0:00.29   1    18     29  652K   200K  1248K    19M
   74 WindowServ   2.7% 12:12.73   5   321   1024  -15M+  107M+   93M   464M
How much disk space do I have on the disk I am currently in:
df -h .
Filesystem     Size   Used  Avail Capacity  Mounted on
/dev/disk0s2  149Gi  139Gi  8.9Gi    95%    /
Thanks to Alexander Hanson for triggering this post by mentioning sw_ver and xcodebuild that I didn't know. I think that's enough for now.

Update 2009-Sep-01: Monipol sent me another suggestion:
sysctl -a # Shows a whole lot of info
sysctl hw.ncpu # Name a specific variable and get the results (here number of cpus.

See also: Monipol on fink bug reports

Posted by Kurt | Permalink

08.26.2009 17:27

Unmanned arial vehicle observing fires

Posted by Kurt | Permalink

08.26.2009 14:33

Detecting duplicate AIS packets in the database

The last release of noaadata contained ais_nmea_remove_duplicates that I talked about a couple weeks ago. This program kept a running list of packets and used the NMEA data payload string has a hash. That was nice in that it dropped the data volume down as early in the process as possible. However, with the timing issue, I want to keep the duplicate packets around. This will allow me to compute the USCG timestamp skew, which is non-monotonic and jumpy. More importantly, when I have my time sense figured out with my new algorithms, I should be able to compare each receivers sense of time. Hopefully, I can then put together the best possible set of position reports and time for each vessel.

ais_build_sqlite now has a new class called "TrackDuplicates". It keeps a queue of recent packets. The list is constrained both by length and time. If I know that my time source is good to a couple minutes, I can constrain the queue to be 5 minutes and I should be able to catch all of the duplicate packets. When a new packet is received, it is assigned a pkt_id. When any additional duplicate packets come in, the tracker will return a duplicate indicator and the pkt_id of the original packet. I keep the pkt_id unique across all of the message types so that I can keep things in order for time analysis that includes both basestations and class A transceivers.

Here are the new columns in the database. Currently, I only check messages 1-4.
ALTER TABLE position ADD pkt_id INTEGER; 
ALTER TABLE position ADD dup_flag BOOLEAN;
ALTER TABLE bsreport ADD pkt_id INTEGER; 
ALTER TABLE bsreport ADD dup_flag BOOLEAN;
Here is a quick example of what the code does:
SELECT COUNT(*) duplicates FROM position WHERE dup_flag=1;
SELECT COUNT(*) duplicates FROM position WHERE dup_flag=0;
SELECT key,userid,longitude,latitude,pkt_id,dup_flag FROM position;
0 367110280 -70.9717683333 42.2428966667 1 0
1 367029470 -71.03155 42.36275 2 0
2 304887000 -70.5767166667 41.7548 3 0
3 367138830 -71.04781 42.3813933333 4 0
4 366883240 -70.9786833333 42.3332166667 5 0
5 368504000 -70.6958666667 41.6759383333 6 0
6 366976890 -71.0473816667 42.38136 7 0
12 367327910 -70.5932983333 42.4034733333 10 0
13 367327910 -70.5932983333 42.4034733333 10 1
14 367327910 -70.5932983333 42.4034733333 10 1
15 367078250 -71.0449 42.3726333333 11 0
16 366999985 -70.9762733333 42.338805 12 0
17 366999985 -70.9762733333 42.338805 12 1
And now for basestations:
SELECT key,userid,Position_longitude as lon,Position_latitude as lat,pkt_id,dup_flag FROM bsreport;
7 3669713 -71.0531116667 42.368345 8 0
8 3669713 -71.0531116667 42.368345 8 1
9 3669971 -70.9544866667 42.3467866667 9 0
10 3669971 -70.9544866667 42.3467866667 9 1
11 3669971 -70.9544866667 42.3467866667 9 1

Posted by Kurt | Permalink

08.26.2009 14:32

CCOM/JHC Flickr Photo Stream

Colleen has just released the CCOM/JHC flickr photostream. It has some gems like this image of the CCOM tiled display showing Portsmouth, NH in GeoZui being controlled with the WiiMote (hats off to Roland for putting this together for a UNH GIS Day Event).

Posted by Kurt | Permalink

08.26.2009 11:26

USCGC Healy Flickr Feed

The USCGC Healy ice breaker has Flickr Photostream

Thanks to Colleen for pointing me to the feed.

Posted by Kurt | Permalink

08.25.2009 19:20

Finding time outliers in an ais feed

Here is my attempt at an algorithm to find the time outliers in a stream of packets. The goal is to throw out packets where the sense of time is corrupted. This is the same Boston area data that has one ship that completely looses its sense of time that I showed a few days ago. First some results. Red x's for points that were rejected, green +'s for points that I kept. This is only the points where I have the commstate for positions or valid basestation reports.

Zooming in to a window of about 400 second period of time shows how the algorithm does a pretty good job removing packets that are close, but not quite on the general time curve. It looks like I have one time record that I should have removed, but I think that will go away with some tuning.

Here is my algorithm for each packet in the sequence:

Go through all the point and mark as invalid any time that is more than minutes away from the timestamp. I have seen errors of up to 80 seconds in the datalogging systems timestamping.

From here on forward, I only consider packets that have not already been flagged as invalid.

I then walk the data in order received from one station. Look back over the last N points. These should be all have times that are older. Stop looking at packets when they are more than 60 seconds older than the current packet. Count how many are newer than a buffer time (using 10 seconds now). If I get more than a critical number of future packets, then I flag the point as invalid.

I then flip this around and look forward. This is a bit more risky. An unflagged bad point could stop my search before I realize that it is bad by triggering the search to stop by being too far in the future. Packets ahead in the stream (received in a later slot) should have newer times. If we see too many old packets, then this packet thinks it is newer than it really is. I am wondering if I should look backwards across the whole data set first, rather alternating looking backward and forward.

Right now, I only look at the next 10 valid points in each direction and clip this if the data is too sparse in time; I only scan up to 200 received messages out and up to 60 seconds. With the 10 second delta time parameter, that gives me 4 degrees of freedom to tune. Here is a quick illustration of looking to either side in the data. They show the direction in which they were found to be bad.

Posted by Kurt | Permalink

08.25.2009 06:40

More Space Based AIS (S-AIS)

This month's Digital Ship has an article about Space-based AIS (S-AIS). The article focuses on COM-DEV International's exactEarth subsidiary and their exactAIS product.
"Our system, proven in orbit, can detect 1,000 ships in less than 90
seconds," said Mr Mabson.
COM DEV says that a satellite picking up AIS transmissions can see an
area of about 5,000km at any one time, covering that space for about
10 minutes.
"AIS has been an extremely successful maritime implementation, there
are now more than 70,000 AIS transponders on ships, the system is
rugged, there are many manufacturers of transponders and it's a
well-proven system," he said.
I haven't seen any data from COM DEV to be able to judge the system. Then end goal is apparently is total of 6 small satellites.

Posted by Kurt | Permalink

08.24.2009 12:58

AIS time from position messages

I've been working on how best to produce a sense of time for received AIS messages where the time stamps on the receiver either do not exist or are untrustworthy. One thing that I assume for an individual receiver is that the messages come out of the receiver in the same order that they came out of the VHF Data Link (VHF). AIS Class A position messages all send their sense of the seconds of the minute from the vessel's GPS. If the vessel is operating in SOTDMA mode, every 8th packet should also include the hour and minutes.

I started by taking a look at two minutes of data from a space based receiver (AprizeSat-3) and see what looks like a fairly rosy looking picture. I plotted the received packet count across the bottom and the seconds timestamp on the y-axis. Red "+" symbols are messages with just the seconds of the minute and the green "x" symbols are the reports that include hour and minute. The two major groups show that ships are reporting a sense of time that ranges about 5 seconds. This is likely delays in getting from assembling a packet to getting it transmitted. There are a small percentage of reports that do not fit the basic line. These outliers likely have a very poor sense of time or there is something wrong with the AIS transceiver. The points with seconds of 60 seconds or higher know they have a poor sense of time:
60 - time stamp is not available
61 - positioning system in manual input mode
62 - dead reckoning mode
63 - positioning system is inoperative)
Reporting that you don't know is a perfectly valid thing to do. Reporting that you know, but you really don't know, is a "bad thing."

With that promissing picture, I took a look at one receiver in the Boston area. This receiver can't hear downtown boston, so it doesn't have that many ships. Here is 10000 samples from the receiver. You can clearly see the minutes ticking over. You can also see the slope of the lines changes as the number of reported vessels changes. More vessels gives a lower slope, as the x-axis is number of samples (not time).

Looking closer, we can see that there are two trails of time that are offset by about 10 seconds.

I tried to come up with reasonable ways to keep track of when the minutes tick over. Then if I assume that this ship is reporting its position with a proper second with a transmission delay, I can generate a valid time for each report. I took a running average that is 60% of the average samples per minute. I can calculate this because I know I am looking at 24 hours of data (+/- a couple minutes).
cg_t_min: 2008-04-07 03:57:20
cg_t_max: 2008-04-08 03:57:37
duration_mins: 1440.28
num_samples: 211830
ave_samples_per_min: 147.08
window_len: 88
In the next two plots (just with different vertical stretches), I have plotted the window average and std deviation. The peak of the running average is right before the beginning of seeing ships with the next minute. This is likely an early best estimate of the minute transition. A likely better match would be the peak of the standard deviation, but we have to look back 1/2 the window size to get the sample that best matches the transition. This will only work if there is a roughly consistent number of ships. A rapid change in number of ships will throw this off.

There is a very strange separation of time in the middle of the minute, where this time seems to separate. I am not sure yet what is going on.

Slight different picks of the transiotion to show how things line up.

Thanks to Maria, Matt, Monica, and Tom for suggestions on how to look at this data.

Posted by Kurt | Permalink

08.24.2009 10:36

SF Mapping - Federal Stimulus


California is receiving $3 million in federal stimulus money for an
undersea mapping project in the San Francisco Bay Area, federal
officials have announced.  ... The research surveys will map the sea
floor, measure water depth, identify storm debris and wreckage, and
identify coastal seabeds and fragile aquatic life, according to the
U.S. Commerce Department, which includes NOAA.  The Commerce
Department said the projects will create jobs, support marine commerce
and trade, and help coastal planning efforts.  The Bay Area surveys
will encompass about 112 square miles as part of a multi-year effort
to create the first comprehensive maps of the state's seafloor and
marine resources, according to the department.

CONTACT: Shannon Gilson, Commerce Department (202) 482-2237
         David Miller, NOAA (202) 482-0013
         John Ewald, NOAA (240) 595-0640

Posted by Kurt | Permalink

08.22.2009 13:22

noaadata 0.43

My ais code library has a new version:

  • ais_normalize now generates proper checksums. plus code cleanup
  • ais_build_sqlite:
    • Now validates checksums
    • Msgs 1-4 now do commstate
    • suggest at least one sql index
    • Don't crash on short lines
    • Added messge 19 for class B
    • Better handling of uscg metadata. Still not handling them all
  • Added ais msg 11 to decoding
  • Msgs 1-4 are now hand coded to read comm state
  • ais/__init__.py now has proper names for messages 11,13,16,17, and 22
  • Added these uscg metadata fields to db: cg_s_rssi, cg_d_strength, cg_t_arrival, cg_s_slotnum
  • aisutils/uscg.py: loosed regex to match bad T messages. Doced all field types
  • Template is more better. Input from Trey Smith
  • ais_liststations. cleanup and progress counts
  • ais_position_in_polygon gives a better error message when pcl-core is missing. Still need to switch to shapely
  • New scripts: ais_nmea_info.py, ais_nmea_remove_dups.py, ais_nmea_uptime.py, ais_sqlite_bsreport_analysis.py
  • ais/commstate handles parsing of the SOTDMA and ITDMA commstates

Posted by Kurt | Permalink

08.21.2009 14:28

Guest blogging on DisasterCam blog

Last week, I wrote a guest blog post for the NASA Ames GeoCam. It's up on their GeoCam / DisasterCam blog. I visited Ames to collaborate on coding and to see about using the Google Phone for passing realtime information with NOAA disaster response groups (e.g. ERMA) and to give out whale position notices.

Sonoma Air Attack 2009/08/07 [disastercam.blogspot.com]

Posted by Kurt | Permalink

08.21.2009 10:53

Large gas plume off of northern California

Jim Gardner here at CCOM and the crew of the NOAA Okeanos Explorer recently discovered a large gas plume in the water column off of northern California. More weight to the argument that collecting full water column data with multibeam sonars can be really important.

If you have AGU membership and can pull EOS electronically (I can't for some reason), the article is in EOS V 90, N 32, 11 Aug

There are movies made with the new mid-water functionality in Fledermaus 7 that shows the plume in 3D. (Based on Roland's technology development in GeoZui4D)

Movies directory at CCOM

Trackback: All the details of the video area available here - IVS3D - 1400 Meter Plume Discovered off the Northern California Coast

Posted by Kurt | Permalink

08.20.2009 16:22

Spaced Based AIS (S-AIS)

Last week, I took my first look at Space-based AIS (S-AIS). For this quick visualization, I used 2 minutes of data from SpaceQuest's two AprizeSat receivers. In two minutes, they collected about 7000 class A position reports with up to 22 reports from each ship and a total of 121 unique MMSIs. Nice!

SpaceQuest just presented these results at TExAS III.


Posted by Kurt | Permalink

08.19.2009 11:32

USCG T flag time base means what?

I was looking to use the USCG "T" metadata field to help check the validity of the sense of time for an AIS messages, but then I ran into this triplet of basestation report receives. This is the same message received at 3 receivers in the Boston area. "T" is supposed to be the receive time seconds within the UTC minute.
I dropped that section of messages in a SQLite db to take a look at what is going on.
ais_build_sqlite.py -d trouble.db3 trouble.ais -C
{1: 14, 2: 0, 3: 0, 4: 5, 5: 0, 18: 0, 19: 0}
Pulling the reports:
sqlite3 -html trouble.db3 \
  'SELECT key, UserID as MMSI, \
          Time_day as day, Time_hour as hour, Time_min as min, Time_sec AS sec, \
          sync_state, slot_timeout, received_stations, cg_timestamp, \
          cg_t_arrival, cg_r \
   FROM bsreport WHERE userid=3669971;'
Here is what I see for those three basestation reports:
3 3669971 7 3 57 49 0 7 18 2008-04-07 03:57:50 40.92853791 r003669945
4 3669971 7 3 57 49 0 7 18 2008-04-07 03:57:25 11.02029935 r003669947
5 3669971 7 3 57 49 0 7 18 2008-04-07 03:57:46 50.26997285 r003669946
I will abbreviate station names by the last two digits... 45, 46, and 47. It looks like station 45 timestamped the the packet as 1 second after it was sent. That's pretty good. However the receiver generated the T as 9 seconds earlier than the station timestamp and 8 seconds before the basestation thinks it generated the message. This looks like the receiver doesn't have a good sense of time.

Looking at station 47 next, things totally come apart. The USCG timestamp is 24 seconds before the basestation thinks it sent the message and the T receive time is either 38 or 22 seconds off (assuming that the it is less than one minute off.

Taking a look at the comm state of a SOTDMA Class A position report in that batch of data, it shows that things for station 45 are at least consistant.
SELECT commstate_utc_hour, commstate_utc_min as com_min, timestamp as sec, cg_timestamp, 
	cg_t_arrival, cg_r 
    FROM position 
    WHERE commstate_utc_hour IS NOT NULL;
3 57 48 2008-04-07 03:57:52 42.90231408 r003669945
There doesn't appear to be a way to track the receivers' GPS locks so that I can know when to trust the T field.

Posted by Kurt | Permalink

08.18.2009 23:31

List of possible ferry vessels from AIS

The shipandcargo field in AIS message 5 (shipdata or "static" ship data) mixes the type of ship and what it is doing. This makes it sometimes impossible to pull a class of vessel. We can use the message 5 to narrow down the possibilities. Over the past week, I have been trying to figure out how to come up with the best list of ferry vessels in the US and Canada (including the Great Lakes). I know I will miss those in places like Lake Tahoe and ships like the Mt Washington in Lake Winnipesaukee. But this is the best I have come up with so far. (I'm leaving out any ferries that might be using Class B AIS).

First, I need to extract all unique type 5 messages that I've received. I have data logged as PREFIX-YYYY-MM-DD.bz2
for file in *20[0-9][0-9]-??-??.bz2; do 
    basename=`basename $file`
    echo $basename

    bzcat $file | ais_normalize.py > $tmp

    # Lookback:  125 stations * 1000 messages = 125000
    egrep 'AIVDM,1,1,[0-9]?,[AB],5' $tmp | ais_nmea_remove_dups.py -v -l 100000 > $basename.5.unique
In the above script, I first normalize the AIS messages such that they all take just one line. Then I use egrep to pull just msg 5 (shipdata). I pass that to ais_nmea_remove_dups that looks at the payload of each message and passes only one copy of each message payload - the 5th field - in bold:
I then run that same duplicate removal process on all the days combined together and put them in an SQLite database:
cat *.5.unique |  ais_nmea_remove_dups.py -v -l 100000 > 5.unique.nmea
ais_build_sqlite.py -C -d 5.unique.db3 5.nmea
{1: 0, 2: 0, 3: 0, 4: 0, 5: 92895, 18: 0, 19: 0}
Not that I started with 94118 message 5 lines. I'm seeing about 1.3% of the messages being undecodable receives of the wrong message length.

Now I can start doing SQL queries to pull apart the data. The first route is to look for self declared high speed craft or passenger vessels. Ferries, like the Cat, can be either a passenger (60-69) or highspeed (40) craft, but not both in a message 5.
sqlite3 5.unique.db3 'SELECT shipandcargo,count(*) FROM shipdata WHERE shipandcargo in (40,60,61,62,63,64,65,66,67,68,69) GROUP BY shipandcargo;'
shipandcargo    count(*)
40      535
60      5876
61      104
62      67
63      47
64      278
65      10
66      45
67      40
68      243
69      3374
The next thing is to bring in the flag under which the vessel is registered. This is given by the 1st three numbers of the MMSI (aka UserID). To get this number from a database record, just devide the MMSI by 1,000,000 (thanks to BRC for this trick). Loading the mmsi table:
sqlite3  5.unique.db3 < /Users/schwehr/projects/src/noaadata/data/mmsi_prefix.sql
Now do a LEFT JOIN to list vessel info and registry flag:
SELECT userid AS mmsi, name, callsign, imonumber, shipandcargo,
       dima+dimb AS width, dimc+dimd AS length, synonym AS flag 
       FROM shipdata
       LEFT JOIN mmsi_prefix ON mmsi_prefix.code==shipdata.userid/1000000 
       WHERE shipandcargo IN (40,60,61,62,63,64,65,66,67,68,69) 
       GROUP BY mmsi,name;
The LEFT JOIN is critical as some of the vessels do not have matching entries in the MMSI prefix table.
1193046,"SAMUEL S COURSEN@@@@",WDB3537,303174162,60,55,19,Null
1193046,"SAMUEL S CURSEN@@@@@",WDB3537,303174162,60,55,19,Null
3160015,"LADY WHITE HEAD     ","VC8150 ",7947960,60,34,10,Null
4490052," @LWHA              ","WY397P ",8835358,60,116,22,Null
31602656,"TANDEM              ",CFB5593,813929,69,20,7,Null
35516535,"P/B WANDERER        ",@@@@@@@,0,64,15,6,Null
35517401,"P/B PHANTOM         ",@@@@@@@,0,69,20,6,Null
36677150,"NAPA                ",WDE8110,0,60,38,10,Null
100000000,"BRITANNIA           ","VY5348 ",0,60,0,0,Null
107502700,"ROBERT H. DEDMAN    ",WCY9892,920700300,69,83,23,Null
111111111,"BLUE ICE            ",WCW8492,1111111,60,0,0,Null
111850000,"FAST BULLET         ",WDA8488,366854890,60,50,10,Null
123456789,"GOOSE BUMPS         ","KUME   ",123456787,60,30,6,Null
123456789,"MADDEN TIDE@@@@@@@@@",WCX9599,0,60,0,0,Null
235762000,"QUEEN MARY 2        ","GBQM   ",9241061,60,345,41,"United Kingdom"
Vessels with bad MMSI settings in their transcievers must be discarded. In the position reports, the MMSI is the only distinguishing value, so we will be unable to tell them from other vessels with that same error.

The next thing we can do is to use knowledge about how ferries have to operate in the US. They have to be registered with the United States, so we can reject any ship that is not registered to the United States, Canada, or Alaska as not possibly being a ferry. (I hope this is really true.)

We can therefor add "AND userid/1000000 in (303,316,338,358,366,367,368,369,379)" to the query:
SELECT code,synonym FROM mmsi_prefix 
   WHERE code IN (303,316,338,358,366,367,368,369,379);
code    synonym
303     Alaska
316     Canada
338     United States
358     Puerto Rico
366     United States
367     United States
368     United States
369     United States
379     United States Virgin Islands
The final query is:
SELECT userid AS mmsi, name, callsign, imonumber, shipandcargo,
       dima+dimb AS width, dimc+dimd AS length, synonym AS flag 
       FROM shipdata 
       LEFT JOIN mmsi_prefix ON mmsi_prefix.code==shipdata.userid/1000000 
       WHERE shipandcargo IN (40,60,61,62,63,64,65,66,67,68,69) 
           AND userid/1000000 IN (303,316,338,358,366,367,368,369,379) 
       GROUP BY mmsi,name;
A little cleaning and using "sqlite3 -csv" give this result:
303472000,"ISLAND CAPER",WAM9248,0,60,29,6,Alaska
303488000,"BERNIE MCALL",WDC6875,936351000,60,0,0,Alaska
303509000,"SPIRIT OF YORKTOWN",WDD5898,8949472,60,68,12,Alaska
303556000,"MARITIME INSTRUCTOR",WCW8625,1033659,60,26,5,Alaska
303677000,"MISS PAMELA",WDC3158,9111254,60,40,9,Alaska
303913000,"GORDON GUNTER","WTEO",8835255,60,205,32,Alaska
316001111,"NAUTILUS EXPLORER",CFG7418,9224647,60,35,8,Canada
316001232,"BOWEN QUEEN","VGXR",6600967,69,88,18,Canada
316001234,"HOWE SOUND QUEEN","CY3842",801691,60,80,20,Canada
338086000,"MISS CARISSA",WDB7801,8977699,69,39,8,"United States"
338088000,"MISS BETH",WDB7803,8977687,60,36,9,"United States"
338092962,"SEABRIGHT",,0,67,20,5,"United States"
338116000,B AP3D P AP,SD U<P4,9265809,40,36,14,"United States"
338116000,FAIRWEATHER,WDB5604,9265809,40,72,22,"United States"
338117000,"JENNY MCCALL",WDC3101,931679900,60,54,11,"United States"
338163000,COPACABANA,WDE6748,1214699,60,53,9,"United States"
338194000,"PHILIP ALAN MCCALL",WDB7424,108745800,60,51,10,"United States"
338195000,"MATANUSKA","WN4201",5228827,69,124,22,"United States"
There are some doubles because of operator error. For example:
316011408,"COASTAL INSPIRATION",CFN4970,9332767,69,160,28,Canada
316011408,"COAUTAL INSPIRATION",CFN4970,9332767,69,160,28,Canada
and there are some likely corrupted packets:
338116000,B AP3D P AP,SD U<P4,9265809,40,36,14,"United States"
Some of these are clearly cruise ships and other vessels that are really not ferries. It takes extra work to get down from 490 vessels to the true set of ferry vessels.

Posted by Kurt | Permalink

08.18.2009 16:50

AIS time sources - basestations verses ships

There are many sources of time in AIS and not all source of time are created equal. Here I am comparing the USCG timestamps to the basestation reported time and ship's time reported in the position report communication state for 3 stations in the Boston area. Most users of AIS are using it for real-time collision avoidance, so they are not concerned as concerned with timestamping. However, we would like to know that exact time that ships approach hydrophones in the area. First a graph of a while day. On the x-axis is seconds from the beginning of the data, while the y-axes is the difference between the time source and the N-AIS timestamps.

Overall, the 2 base stations agree with the timestamps recorded by the N-AIS stations within about 60 seconds (the thick green line of points near 0 seconds). For the most part, the clocks for all the ships in the area are equally close to the N-AIS time and not visible behind the mass of green. However, late in the day, there appears to be a lot of reports of ships' time being very far off from N-AIS. Turns out this is primarily just one ship having issues: CMA CGM ELBE. There is no way to know why it is off by up to 80000 seconds. If anyone knows what type of AIS transceiver they have and what likely went wrong, I would really like to know.

Baring the CMA CGM ELBE, it's worth a look up close at 42 minutes of data:

Here we can see that, in general, the ships' time and basestation time is tracking pretty well.

Posted by Kurt | Permalink

08.18.2009 16:32

Horsing around on the Healy

Perhaps some of the crew has been away from land too long. Scientific graffiti?


Posted by Kurt | Permalink

08.18.2009 14:38

AIS Class A position database entries

Since I keep needing this, here is my database cheat sheet for AIS Class A position messages. Note that this does not match what ais_build_sqlite.py creates right now, but it's close. In addition to additional fields, I'm ditching DECIMAL for REAL.
CREATE TABLE position ( 
	MessageID INTEGER,		-- 1 or 2 for SOTDMA or 3 for ITDMA
	RepeatIndicator INTEGER,	-- 0 means good.  1-3 means the message has been repeated (DANGER!)
	NavigationStatus INTEGER,
	PositionAccuracy INTEGER,
	longitude REAL,
	latitude REAL,
	TrueHeading INTEGER,
       	sec_timestamp INTEGER,
	--RegionalReserved INTEGER,	-- Replaced by the next field
        special_manoeuvre INTEGER,	-- 0 N/A, 1: not engaged, 2: engaged
	sync_state INTEGER,		-- 0 UTC direct, 1 UTC indirect, 2 sync to bs, 3 sync to mobile to bs
	slot_timeout INTEGER,		-- frames until new slot.  Controls comstate type
	received_stations INTEGER,	-- SOTDMA: # of stations this receiver hears
	slot_number INTEGER,		-- SOTDMA: slot number used for transmission of this message
	commstate_utc_hour INTEGER,	-- SOTDMA: Hour of the day
	commstate_utc_min INTEGER,	-- SOTDMA: UTC minutes
	commstate_utc_spare INTEGER,    -- SOTDMA: Should be zero
	slot_offset INTEGER,		-- SOTDMA: Offset to next slot used for next transmission. 0 if done talking
	slot_increment INTEGER,		-- ITDMA:  Offset to next slot for ITDMA.  0 if done talking
	keep_flag INTEGER,		-- ITDMA:  1 means keep the slot allocated for an additional frame.
	slots_to_allocate INTEGER,	-- ITDMA:  Slots to allocate.

        -- USCG metadata fields
        cg_s_rssi INTEGER,		-- 's' Receive strength indicator
        cg_d_strength INTEGER,		-- 'd' dBm receive strenth
        cg_x_count INTEGER,		-- 'x' Count of messages received by that station
	cg_t_arrival INTEGER,	        -- 't' HHMMSS.SS receive time
	cg_s_slotnum INTEGER,		-- 'S' Slot number of receive.  9999 means?
	cg_r VARCHAR(15),		-- Receiver or basestation name
	cg_sec INTEGER,			-- Unix UTC receive timestamp to second resolution
	cg_timestamp TIMESTAMP 		-- UTC time of receive to second resolution

Posted by Kurt | Permalink

08.18.2009 13:05

python and configure like status reports

With python 2.5 came conditionals (PEP 308. In C, conditionals look like:
char *results = (retval ? "true string" : "false string");
In python, the syntax is a bit different, but is easier to read if you are not used to it. Conditionals make it easy to do GNU autoconf configure like reporting of tests (or unittest with verbose mode in python).
#!/usr/bin/env python

import sys

def good():
    return True

def bad():
    return False

print >> sys.stderr, 'trying good...\t',
r = good()
print >> sys.stderr, 'ok' if r else 'FAILED'

print >> sys.stderr, 'trying bad...\t',
r = bad()
print >> sys.stderr, 'ok' if r else 'FAILED'
Running the above script looks like this:
% results.py
trying good...  ok
trying bad...   FAILED

Posted by Kurt | Permalink

08.17.2009 22:13

Technical Exchange of AIS via Satellite (TExAS)

Update 18-Aug-2009: TexasIII

This week is the invite only S-AIS meeting in D.C. I'm not one of the the attendies, but the last TExAS meeting from Sept 2008 is online:

Technical Exchange of AIS via Satellite (TExAS) II
2. USCG National AIS Project Increment 3 - Briefer: Mr E G Lockhart
3. Volpe MSSIS/IALA.net - Briefer: Dave Phinney, Volpe Center
4. Naval Research Lab (Gladis) - Briefer: Jim Tugman, NRL
5. Analysis of Spacequest acquired AIS signals - Briefer: Dino Loenzini, Spacequest
6. Status of AIS Frequencies Nationally/Internationally - Briefer: Joe Hersey, USCG
7. Technical Limitations and Possible Solutions - Briefer: Ross Norsworthy
8. Report on AIS via COMDEV Briefer: George - Best, COMDEV
10. AIS Analyses results and standards - Briefer: Dave Pietraszewski, USCG R&D Ctr
10-2. Status of Ship Wakes - Briefer: Jake Tunaley USCG R&D Ctr
12. Radar Sat Processing Tech - Brief Briefer: Dr Jake Tunaley
13-2. TerraSar Basic Product Spec Document
14. OGMSA overview - Briefer: Guy Thomas, OGMSA

Posted by Kurt | Permalink

08.17.2009 11:15

Neptune LNG safety zone

USCG Sets Safety Zones Around Neptune LNG Deepwater Port Submerged Turret-Loading Buoys

SUMMARY: The Coast Guard is establishing two temporary safety zones
extending 500 meters in all directions from each of the two submerged
turret- loading buoys and accompanying systems that are part of GDF
Suez Energy's Neptune Deepwater Port located in the Atlantic Ocean off
of Boston, Massachusetts. The purpose of these temporary safety zones
is to protect vessels and mariners from the potential safety hazards
associated with construction of the deepwater port facilities and the
large sub-surface turret buoys, and to protect the deepwater port
infrastructure. All vessels, with the exception of deepwater port
support vessels, are prohibited from entering into, remaining or
moving within either of the safety zones.

DATES: This rule is effective from July 31, 2009 through February 16,

Posted by Kurt | Permalink

08.17.2009 10:57

CCOM page about the current Healy cruise to the Arctic

Here is the CCOM HEALY 09-05 Research Cruise page. Andy Armstrong has been sending back reports via emails over the satellite communications systems: Daily Notes from the Arctic: A blog by Andy Armstrong, Physical Scientist/Co-director UNH Joint Hydrographic Center

Some example images from the team work going on between US and Canada for joint operations doing both seismic profiling and multibeam data collection.

The USGS also has a cruise blog with an RSS feed that contains many of the same posts as what is at CCOM.

And here is a great picture of the director of CCOM, just before storming the Healy:

Meanwhile, the rest of us CCOM are back in NH with 90° F and humid weather.

Posted by Kurt | Permalink

08.14.2009 15:12

Sqlite tricks - .sqliterc and html mode

I am doing yet more work with sqlite and was getting annoyed with having to setup sqlite how I like it. The '|' is not the most friendly separator; I do like having the column names at the top; and the NULL value should be some form of "Null" or "NULL". I usually do:
sqlite3 foo.db
.header ON
.separator "\t"
.nullvalue "Null"
I looked at the man page for sqlite3, and it indeed does respect a .sqliterc. I'm much happier with those defaults in there. I also noticed that there are "csv" and "html" modes. The html mode outputs the table rows, but not the overall table, so I have to wrap the command in two echo's to get a working (but not complete html file):
( \
  echo "<table>"; \
  sqlite -html 20090506-2.db3 'SELECT name,shipandcargo FROM shipdata LIMIT 10;' \
     | grep -v 'Loading resources' \
     | tr '@' ' '; \
  echo "</table>" \
)  > foo.html
There are two quirks here. First, sqlite prints out a status line that it is reading the init script to stdout even when in non interactive mode:
-- Loading resources from /Users/schwehr/.sqliterc
Second, I'm dealing with AIS strings that have not had the no character designator, '@', scrubed from the database. tr does a nice job of turning those into spaces. The results:

Posted by Kurt | Permalink

08.13.2009 10:56

AIS and bitstuffing issues

The whole binary message creation process is suffering from the lack of a well thought out guide to creating messages that work well. I was mostly focusing on what I have to decode and hadn't been thinking about how things were going across the network. This means I had not though about bit stuffing.

For a single slot message, if we use all 168 bits, then we will have room for 4 bit-stuffing additional bits. Bit stuffing adds a 0 bit every time there is 5 1 bits in a row. Therefore 11111 becomes 111110. This is generally not a big deal until you realize that most of our default / unavailable values have been the highest value for a field. So for a 7 bit field, we would make unavailable be 127 (111111 going out as 11111011). Greg pointed out that if we use 121 as unavailable, which is fairly common, then we have a better scituation:
from BitVector import BitVector
bv = BitVector(intVal=121)
print str(bv)
So for safety, the unvailables should be 0 (zero) if possible. Otherwise, here are the values that work for the high end.
A training 1 will likely cause the next high value to generate a bit-stuff event. Here is a table of possible default/unknown values. The whole point to try to keep as many messages as possible to just 1 slot.
NoBits MaxVal   Bits    Unavail   Bits    Loss-of-Range
 5       31       11111   30        11110 3%
 6       63      111111   60       111100 5%
 7      127     1111111  122      1111010 4%
 8      255    11111111  246     11110110 4%
 9      511   111111111  494    111101110 3%
10     1023  1111111111  990   1111011110 3%
I wonder when this will cause more confusion than help. Basically, I tried to keep no more than 4 ones in a row and make sure it is okay if you have two of these unknowns in a row.

Posted by Kurt | Permalink

08.12.2009 20:58

pbzip2 for parallel compression

I just heard about pbzip2, which does parallel compressing of files using pthreads. I gave the same 4.4G file to both bzip2 and pbzip2 on an 8-way linux box with pbzip2 using 7 threads. The original way I compress data:
time bzip2 -9 b
real    17m19.233s
user    16m41.131s
sys     0m8.296s
The new parallel way:
time pbzip2 -9 -p7 a
real    2m45.526s
user    18m33.926s
sys     0m13.202s
17:19 became 2:45. Not bad speedup. The compressed file sizes are not quite the same.
842988852 Aug 12 23:19 a.bz2
841630033 Aug 12 23:19 b.bz2
I uncompressed the file that I compress with pbzip2 using bzip2 to make sure there is no loss or incompatibility and it checks out with the original using cmp. For now, I'm going to stick with bzip2 for my critical data, but if I have to move fast, I might go for pbzip2.

Posted by Kurt | Permalink

08.12.2009 16:59

Where is the AIS tracking info on the missing ship?

Finding the path of the ship and what happened when up to the point where it left European waters, should be easy... just pull the AIS logs from one of the several tracking networks. I don't get why they don't know. I do this all the time for US waters.

Ship disappears after sail through English Channel
On July 28, the Arctic Sea made contact with British maritime
authorities as it passed through the busy English Channel. The ship
made a routine, mandatory report - saying who they were, where they
were from, where they were going and what their cargo was. It appeared
routine, said Mark Clark of Britain's Maritime and Coastguard Agency.
Where the ship was next spotted is uncertain. Russian media reports
say the last contact was on July 30 when the ship was in the Bay of
Biscay, and that it was later spotted by a Portuguese patrol plane,
but there was no contact.

But Portuguese Navy spokesman Commander Joao Barbosa said "we can
guarantee that the ship is not in Portuguese waters nor did it ever
pass through Portuguese waters."

Posted by Kurt | Permalink

08.12.2009 11:21

Plotting in matplotlib from databases

I've been trying to figure out how to get from data in the database to a matplotlib graph with the least amount of typing. I started with this:
fink install matplotlib-py26 ipython
ipython -pylab
import sqlite3
cx = sqlite3.connect('20080407.decimated.db3')
a = array( [tuple(row) for row in cx.execute('SELECT cg_offset, delta_t_sec FROM bs_time WHERE recvr=2;')] )
x = a[:,0]
y = a[:,1]
Trey suggested I just try transposing the array. My original matrix is pairs of x,y:
(50756, 2)
(2, 50756)
array([[  2.40000000e+01,   2.50000000e+01,   3.40000000e+01, ...,
          8.63840000e+04,   8.63940000e+04,   8.64040000e+04],
       [  0.00000000e+00,  -1.00000000e+00,   0.00000000e+00, ...,
          2.00000000e+01,   2.00000000e+01,   2.00000000e+01]])
Then I can use the "*" before the array in the call to plot to expand out the first dimension of the array to be the arguments... this pulls out x and y separately. The resulting ploting looks like:
plot(*array( [tuple(row) for row in cx.execute('SELECT cg_offset, 
      delta_t_sec FROM bs_time WHERE recvr=2;')] ).T)

Posted by Kurt | Permalink

08.12.2009 09:34


I've heard a lot about Hydropalooza from NOAA folks. Here are more details:

'Hydropalooza' Provides Deeper Understanding of Alaska's Kachemak Bay
Crews on board NOAA ships Fairweather and Rainier will conduct
hydrographic surveys of the seafloor, measuring depths and identifying
obstructions. When the ships complete data collection in early
September, they will have surveyed more than 350 square nautical miles
- an area nearly twice the size of Chicago.

"This is one of our largest survey efforts of the year," said
Capt. Steven Barnum, director of NOAA's Office of Coast Survey and
U.S. national hydrographer. "The data we collect will contribute to
navigation safety in the state and also be used to keep the coast
healthy and productive."
"Our data are used by countless federal, state, and local
stakeholders,ˇ said Captain Roger Parsons, NOAA Corps (ret.), IOCM
director. ´Coastline and seafloor data, once collected solely for
updating NOAA's national suite of nautical charts, is now used for
marine spatial planning efforts, ocean circulation monitoring, and
assessing the impacts of climate change, among other uses."

Posted by Kurt | Permalink

08.11.2009 19:28

building a Ubuntu debian package from a source deb

Trey and I are working to get some stuff packaged for Ubuntu. What follows is not totally perfect. If anyone has corrections or suggestions, please email me!

We started from the Ubuntu Hands On Packaging Guide. My initial attempt at packaging something focus on OpenLayers and I used jquery/1.2.6-2ubuntu1 as a guide.

Setup the initial world:
sudo apt-get install libjpeg-dev libpng-dev libblas-dev autoconf libtool \
  build-essential iblapack-dev libboost-dev imagemagick libexpat1-dev python-numpy \
  apache2 libapache2-mod-wsgi libreadline-dev flex # For GeoCam
sudo apt-get install devscripts build-essential wget fakeroot cdbs patchutils debhelper # debian install tools
Now, pull down the source debian for jquery. The dget and apt-get source command appear to be the same, except dget pulls a particular version.
mkdir -p deb_pkgs/jquery
cd !$
dget -xu https://launchpad.net/ubuntu/jaunty/+source/jquery/1.2.6-2ubuntu1/+files/jquery_1.2.6-2ubuntu1.dsc
# apt-get source jquery ### Instead of dget, you can do this line
cd ..
mkdir hello-debhelper
cd !$
apt-get source hello-debhelper
cd ..
Now that we have the two examples, it's time to get going on openlayers.
mkdir openlayers-2.8
cd !$
wget http://openlayers.org/download/OpenLayers-2.8.tar.gz
cp OpenLayers-2.8.tar.gz openlayers_2.8.orig.tar.gz
mkdir -p debian/openlayers-2.8
cd debian/openlayers-2.8
dh_make -e kurtschwehr@yahoo.com # Puts a ton of stuff in the debian directory
cd debian
rm *.ex *.EX # Remove un-need template files
Openlayers is a pretty simple package (but not as simple as jquery). We need to adjust the makefile in the debian directory named rules.
#!/usr/bin/make -f
# -*- makefile -*-
# Uncomment this to turn on verbose mode.
#export DH_VERBOSE=1

	(cd build; ./build.py)
	rm -f tools/*.pyc # Fails to build without this cleanup



binary-indep: install


binary: binary-indep

.PHONY: build clean binary-indep binary-arch binary install
OpenLayers needs to construct a "minified" library, so the build rule has to go into the build directory that comes with OpenLayers and run the build python script. However, this leaves python compiled files (pyc). These cause the debian create process to be unhappy.

The next file to modify in the debian directory is the control file.
Source: openlayers
Section: web
Priority: extra
Maintainer: Kurt Schwehr 
Depends: ${misc:Depends}
Suggests: javascript-common
Build-Depends: debhelper (>= 6)
Standards-Version: 3.8.0
Homepage: http://openlayers.org/

Package: libjs-openlayers
Architecture: all
Description: JavaScript web mapping
  OpenLayers makes it easy to put a dynamic map in any web page. It
  can display map tiles and markers loaded from any source. MetaCarta
  developed the initial version of OpenLayers and gave it to the
  public to further the use of geographic information of all
  kinds. OpenLayers is completely free, Open Source JavaScript,
  released under a BSD-style License.
In the control file, the blank line matters, the long description must be indented, and there must be a new line at the end of the file.

I am not sure what ${misc:Depends} does and note that the Package name is different. This libjs will let people search for all the JavaScript libraries.

The next step for me was to setup a GNU Privacy Guard key to sign the package as described in the online book The GNU Privacy Handbook
gpg --gen-key # Use all the default values
gpg --output kurtschwehr.gpg --export kurtschwehr@yahoo.com # Put in your name and email here
scp kurtschwehr.gpg schwehr.org:www/ # Put your public key somewhere for people be able to get it
The debian directory really does have to be inside the unpacked source tree.

Now we are ready to build the file.
cd ~/deb_pkgs/openlayers-2.8/openlayers-2.8
# enter your password 2 times to sign the package
ls -l libjs-openlayers_2.8-1_all.deb
This builds a binary deb that we can install in a moment. First we want to build the source debian.
debuild -S
ls -l ../openlayers_2.8-1.diff.gz
At this point we tried to test the package with pbuilder in it's own little space but never succeeded.
sudo pbuilder --create # Takes a while
sudo pbuilder build ../*.dsc
But this kept failing. First it complained about no debhelper 7 or greater, so I backed off to requiring debhelper 6 or greater. However, I still got this:
pbuilder-satisfydepends-dummy: Depends: debhelper (>= 7) but it is not installable
To finish it all off:
sudo dpkg -i libjs-openlayers_2.8-1_all.deb

Posted by Kurt | Permalink

08.11.2009 10:34

More Healy cruise info

CG Arctic Continental Shelf Research
A Coast Guard Cutter Healy boatcrew along with a scientist from
Scripps Institution of Oceanography recovered a sonobuoy, Aug. 8,
2009, while on a scientific research mission in the Arctic Ocean.
And the Healy has met up with the Louis St. Laurent.

Posted by Kurt | Permalink

08.11.2009 09:07

Unified robot operating systems

Robots to get their own operating system
But this happy meeting of robotic beings hides a serious problem:
while the robots might be collaborating, those making them are
not. Each robot is individually manufactured to meet a specific need
and more than likely built in isolation.

This sorry state of affairs is set to change. Roboticists have begun
to think about what robots have in common and what aspects of their
construction can be standardised, hopefully resulting in a basic
operating system everyone can use. This would let roboticists focus
their attention on taking the technology forward.
I have to say that attempts like this are actually common. Just one case in point is CLARAty [JPL] that was a colaboration of several NASA centers and universities.

Other buzzwords from projects unify things in the robotics world: NDDS, DDS/OpenDDS, ControlShell, LabView, TCX, TCA, RTC, CORBA, CMU-IPC, Microsoft Robotics Studio, TDL (Task Description Language), yada yada...

From Trey: "It is important that young people not understand how difficult things are." Basically... just go for it.

Posted by Kurt | Permalink

08.10.2009 15:46

Building your own debian dpkg world

WARNING: Work in progess. Use at your own risk

If you want total control over an environment, then you need to have your own personal package managed area. I prefer debian to rpm in general, so Trey and I are looking at building a small debian world on top of MacOS, Ubuntu, and Windows Cygwin. The goal is to layer this over say fink in MacOS and then we can do exactly what we want without impacting fink. This should even work on an RPM distribution. How to start? Build dpkg of course. Since I am a fink contributor, I'll use that as the starting point. I will be using exactly the same version of dpkg (1.10.21) to make sure the patch will work without a hitch. First, I copied the source from fink.
fink rebuild dpkg # Force fink to download the source
mkdir ~/foo && cd ~/foo
cp /sw/fink/10.4/unstable/main/finkinfo/base/dpkg.patch . # Pull in from fink
I'm going to try to keep things simple than fink and use gettext from my path (aka from fink). Therefore, I needed to remove the first two entries in the dpkg.patch that are for get text. Use your favorite editor here (emacs, right?).
tar xf /sw/src/dpkg_1.10.21.tar.gz
sed 's|@PREFIX@|/Users/schwehr/foo/world|g' < dpkg.patch | patch -p0
mkdir build && cd build
mkdir $rootdir
../dpkg-1.10.21/configure --prefix=$rootdir  --without-start-stop-daemon --without-sgml-doc \
  --with-admindir=$rootdir/var/lib/dpkg --mandir=$rootdir/share/man \
  --infodir=$rootdir/share/info --srcdir=$rootdir/../dpkg-1.10.21
make  all-include all-optlib all-lib all-dpkg-deb all-split all-scripts all-utils \
  all-main all-dselect all-po all-methods all-man all-doc -j 3
This fails with a ranlib error on the archive. Eh?
make # Seems to then work?  Will this blow up later?
At this point, it works and I need to figure out how to make a debian out of it to bootstrap the system. Calling make install here would defeat the purpose of package managing the system. To play with it, I went ahead and did the install. In the future, I should do something like:
make install DEST=somewhere
And then I can build a deb of it and get dpkg to self install the deb of dpkg.

Posted by Kurt | Permalink

08.10.2009 10:06

air tanker drop

Excellent video of air drop of fire retardant from last week.

Posted by Kurt | Permalink

08.10.2009 09:06

NOAA Press Release on Arctic mapping

NOAA Joins Other U.S. Agencies and Canada to Survey the Arctic Continental Shelf [NOAA News]

Larry Mayer, director of the Center for Coastal and Ocean Mapping and
co-director of the Joint Hydrographic Center, is the chief scientist
for the U.S. mission. NOAA's Andy Armstrong, a physical scientist and
co-director of the Joint Hydrographic Center, is the co-chief
scientist. NOAA and the University of New Hampshire jointly operate
the Joint Hydrographic Center.

The 41-day joint mission runs from August 7 to September 16 and will
see the U.S. Coast Guard Cutter Healy and the Canadian Coast Guard
Ship Louis S. St-Laurent operating together to obtain a variety of

"NOAA and the Joint Hydrographic Center will take the lead in
collecting bathymetric data from the Healy to map the seafloor, while
the Canadian icebreaker collects seismic data to determine sediment
thickness," said Craig McLean, deputy assistant administrator for
NOAA's Office of Oceanic and Atmospheric Research. "This
collaboration saves millions of dollars by ensuring data are collected
only once in the same area, and by sharing data useful to both
Full Resolution

Posted by Kurt | Permalink

08.09.2009 14:42

MarNIS on AIS, LRIT, etc

The Maritime navigation Information Services (MarNIS) project has some interesting papers online in their documents section. e.g. this paper by Tremlett and Klepsvik:

D2.1: Interim policy recommendations on WP 2.1 Integration of AIS functionalities
At present the following automatic ship reporting systems are either
adopted or proposed for adoption by IMO and FAO, including:

-	Global Maritime Distress and Safety System (GMDSS)
-	Automatic Identification System (AIS)
-	Long-range AIS (LR-AIS)
-	Long Range Ship's Identification and Tracking (LRIT)
-	Vessel Monitoring System for fisheries (VMS)
-	Ship security alert system (SSAS)
-	'Daily Report'

In addition, a number of commercial fleet management services based on
GPS positioning are in operation, for the most part based on
Inmarsat-C position reports.

While AIS Class A and Class B will meet the need of VTM for detailed
automated ship reporting within the coastal areas and in-land waters
covered by VHF shore stations, no such system has been adopted or
proposed on a global scale.  The recently adopted LRIT system will, in
its present form, only provide identity, position and time information
and basically cover maritime security applications.

The future take-up of AIS Class B by Non-SOLAS and leisure vessels
remains to be seen.  However, with a price tag of around 500 Euros and
offered as an optional item to VHF radios, it is likely that Class B
transponders will be well accepted by at least the larger leisure

However, there are some questions raised with respect to the
capability of AIS shore stations to handle the non-priority Class B
traffic in some areas with an adequate reporting rate to be of any use
to both the VTM authorities and the mariners. Such deficiencies may be
augmented using other available or proposed terrestrial communication
systems, i.e. WIMAX, DVHF or GSM/UMTS.  Solutions based on WIMAX using
a peer-to-peer methology may also offer ranges beyond VHF and GSM and
possibly serve as a 'gap-filler' between AIS and LRIT.

Posted by Kurt | Permalink

08.09.2009 09:51

Healy finds arctic ice

The Healy has found sea ice... I see from their track that they are moving off in a different direction. Just heard that this year, the team got out to the Healy via a landing craft rather than the normal helocopter ride.

Feeds are available here:


Posted by Kurt | Permalink

08.08.2009 16:50

iPhone note files on the Mac

I've been attempting to take some notes on the iPhone to see what I think of the platform. I synced the iPhone to my mac and tried looking for some place that might be hiding the notes. Turns out it is in the Mail app:
% ls ~/Library/Mail/Mailboxes/Notes.mbox/Messages
29343.emlx  29345.emlx  29347.emlx  29349.emlx  29351.emlx  29353.emlx
29344.emlx  29346.emlx  29348.emlx  29350.emlx  29352.emlx  29416.emlx
Time to take a look at what is in a note:
-Uniform-Type-Identifier: com.apple.mail-note
Message-Id: <693D209C-19BF-4687-BD99-DAC020E4ECDD@gateway.2wire.net>
From: =?MACINTOSH?Q?Kurt=D5s_iPhone?=
Content-Type: text/html;
-Universally-Unique-Identifier: 86207fd8-541f-434f-a374-84a9e6f735d9
Content-Transfer-Encoding: 7bit
-Mail-Generated-Subject: YES
-Mail-Created-Date: 2009-07-19 15:47:26 -0700
Mime-Version: 1.0 (Apple Message framework v935.3)
Subject: =?MACINTOSH?Q?=CAerrands?=
Date: 2009-07-24 15:51:48 -0700

mongolian beef<div>muffins&nbsp;</div><div>waffles</div><div>milk</div><div><br><div><br></div></div>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
	<string>Kurt's iPhone</string>
This is almost an RFC822 message. I just need to remove that first line that is the message size(right?).
#!/usr/bin/env python

import email
import sys

em_file = file(sys.argv[1])
em = email.message_from_file(em_file)

content = []
for line in em.get_payload().splitlines():
    if '<?xml version="1.0" encoding="UTF-8"?>' == line.strip():

results = ''.join(content)
print results
After doing fink install html2text, I can parse my message:
% ./notes.py 29344.emlx | html2text
mongolian beef
All this because I use Thunderbird and don't want to open mail or to email notes to myself.

Posted by Kurt | Permalink

08.08.2009 09:31

CCOM/JHC team on the Healy and heading North

Last week, the CCOM/JHC science team for the Healy packed up and traveled up to Barrow Alaska. They are now on the Healy and steaming north for more sea floor mapping.

Also, there is a new US government website about the project: http://continentalshelf.gov/

Welcome to the Website for the U.S. Extended Continental Shelf
Project! The 2009 U.S.-Canada Arctic Continental Shelf Survey runs
from August 7 through September 16.

Posted by Kurt | Permalink

08.08.2009 09:07

Cal Fire Sonoma

Yesterday, I visited Cal Fire Sonoma air attack station and the Cal Fire Napa command station with Trey and Alex R. Both teams were supper hosts!

You can listen in to CalFire's radio traffic on ScanCal

Posted by Kurt | Permalink

08.08.2009 09:04

WHOI Silent buoy mooring lines

Since I work on ListenForWhales.org that uses these buoys:

WHOI Image of the Day - Silent buoy

Posted by Kurt | Permalink

08.08.2009 09:00

Controlling an aircraft with an iPhone / iPod Touch

I mentioned the idea of controlling an AUV via an iPhone before.

New Use for Your iPhone: Controlling Drones [Wired]

Posted by Kurt | Permalink

08.06.2009 10:27

GeoWeb standards

GeoWeb Standards - Where we are [HighEarthOrbit]

Andrew explains the basics of a range of formats that include ShapeFiles, Microformat-Geo & Adr, GeoRSS, KML, GeoJSON, GML, ...

I appreciated his discussion of GeoRSS. I hadn't looked at RDF and was seeing it being used as a serialization format. Now I know.
Despite it's widespread adoption, GeoRSS has some complexities that
arose out of it's development. There are 9 potential 'flavors' of
GeoRSS, although this is largely due to the 3 different formats of
feeds: RDF, RSS, and Atom. There are still 3 formats of GeoRSS itself
that can be utilized in any of the 3 feed formats: W3C, Simple, and
GML. This causes confusion for developers, especially since W3C format
is deprecated but still widely used. Perhaps this is one reason that
despite GeoRSS being a simple extension to existing feed formats,
there still is not GeoRSS support in any of the major news feed
readers, except perhaps limited support in FriendFeed.

In addition, GeoRSS hasn't really advanced in quite awhile despite
multiple requests and discussions of extensions for multiple
locations, time spans, external geometries, and feature
If this stuff interests you, also take a look at what Sean Gillies has to say:

It's the imagery, stupid of why we are using KML more than GML. And Geographic syndication in a nutshell

Posted by Kurt | Permalink

08.06.2009 08:19

Niagra Falls from an airplane with an iPhone

The iPhone camera is okay, but this shows that even a super cheap point and shoot with a real lense would have done better. Yesterday, I got my first sight of Niagra Falls. We were at 36000 feet.

Posted by Kurt | Permalink

08.06.2009 08:08

New England maritime stuff

The tall ships are coming into Portsmouth, NH this weekend. It will be crowded, but worth it to see them. Too bad I will be missing them.

One of the ships had some issues on the way: Tall ship Unicorn freed from rocks in Great Harbor

Local dredging is delayed in Dover: Cocheco River dredging delayed; to resume in fall 2010

Lobster fishermen can be intense folk: Lobster boats sunk after harbor shooting
Three lobster boats were sabotaged and two of them sunk in a
midcoast harbor amid growing tensions among lobstermen two weeks after
another lobster turf war escalated into a shooting on Matinicus
Island, law enforcement officials said Wednesday.

Posted by Kurt | Permalink

08.05.2009 21:37

Best practices for AIS message creation?

I'm reviewing some AIS binary messages and was noticing that one field in a message used the next value in the series to denote unknown, where another field in the same message used the highest value possible for the field to denote unknown. I decided to look at ITU.1371-3 at the position messages and found exactly the same thing. We desperately need a best practices document that details the style to follow. I know I read something that started to get into this, but can't find it in my stash of technical specs and it's not in:

USCG AIS Standards

Posted by Kurt | Permalink

08.05.2009 21:25

ROV Code of Practice - What about AUV's?

I remember making checklists for the NASA Ames Telepresence ROV (TROV) back in 1992. We always seemed to be learning things the hard way.

ROV Code Of Practice Updated
Designed for use by clients and contractors alike, the International
Marine Contractors Association (IMCA) 'Code of Practice for the Safe
and Efficient Operation of Remotely Operated Vehicles (ROVs)' has been
reviewed and updated to reflect technological and operational
developments, and includes additional guidance provided in particular
on safe crewing levels for ROVs equipped with the growing range of
tools packages in use within this fast-developing industry.
So where is code of practice for AUV's? I can think of some key things to include... starting with always make sure you have all the islands in the area digitized and in the planning tool before making any mission plans. (At least the vehicle called home on the iridium to tell the team where it was stuck...)

Posted by Kurt | Permalink

08.04.2009 17:01

Adventures in Oceanography - Dale Stokes

Via the Art T. news network... a presentation at the Scripps Aquarium. Building scientific instruments and lots of other stuff.

Posted by Kurt | Permalink

08.04.2009 16:54

NOAA move on the west coast

NOAA Selects Newport, Ore., as New Home of Marine Operations Center-Pacific

This is going to be a big shift of a lot of people I know.

The NOAA Marine Operations Center-Pacific is comprised of
approximately 175 employees, including more than 110 officers and crew
assigned to the NOAA ships McArthur II, Miller Freeman, Rainier and
Bell M. Shimada, a new fisheries survey vessels expected to join the
NOAA fleet in 2010.

Posted by Kurt | Permalink

08.04.2009 14:36

Fairpoint DSL troubleshooting

It's painful to be on the phone with technicians that can only use a playbook and can't think through a problem. My DSL line with Fairpoint has been getting flaky over the last few weeks. In the past, when I had trouble, they looked at the signal level and had to adjust the speed down for a while to get a reliable connection. Things got better for a while, but now is was unusable. I called the tech and I asked if he could check the logs to look at all the drops. He said he can't look at any technical details. He could only see if I was connected or not. He then proceeded to push me through reseting the modem and re-entering the user and password for fairpoint. How does intermittent connection equal wrong username and password? He never once had me look at the Westel's troubleshooting tools. I think it's the westel, so I went and got a Netgear DGN2000 with builting wifi and DSL. That seems to be doing well and does an okay job of showing me what is up with the connection. I think I will stick with the netgear device.

What should these numbers be?

Posted by Kurt | Permalink

08.02.2009 15:41

ringtone scams

Dear Congress,

This just shouldn't be allowed. I've purchased music on CD and ripped it into iTunes. I thought I would just quick make a ring tone. But no! It says I can only make ring tones from music that is purchased from the iTunes store. Since I didn't want to repurchase a song that I already bought, I went and got my first song from the iTunes store. I made sure it had the bell next to it for a ring tone. Once I had the song in my iTunes library, I asked iTunes to make a ring tone and it asked me to pay another $1. What? I've already payed for the song. Forget that. I have sound files that I've recorded myself. I own the copyright. This means that I can't make ring tones of audio that I have the total rights to. Ridiculous! I refuse to keep paying for this kind of thing. I own hundreds of CDs. I like supporting artists and I'll buy the CD and go to concerts, but not this.

Now, there is a work around. Garage Band will allow you to import audio files if they aren't DRMed and it exports ring tones. So if you have ripped a CD, go this route. You can put in up to 40 seconds of music into the ringtone.... Importing Music into GarageBand and Making Ringtones [Moni-blog]

I support the concept of copyrights for works, but not this kind of thing and what amazing things would happen if copyrights expired after 25 or 30 years!

Posted by Kurt | Permalink

08.02.2009 00:21

NMEA / AIS highlighting using python pygments

I've taken the first stab at a Lexer for pygments to colorize/highlight general NMEA statements and more specifically USCG old style AIS with receive metadata. It's by no means perfect, but it does get things going. Pygments uses regular expressions to allow parsing of languages and formats, so let me start with my current python re definition for the USCG N-AIS format:
uscg_ais_nmea_regex_str = r"""[!$](?P<talker>AI)(?P<stringType>VD[MO])
  | (,s(?P<s_rssi>\d*))
  | (,d(?P<signal_strength>[-0-9]*))
  | (,t(?P<t_unknown>[^,]*))
  | (,T(?P<time_of_arrival>[0-9.]*))
  | (,x(?P<x_station_counter>[0-9]*))
  | (,(?P<station>(?P<station_type>[rbB])[a-zA-Z0-9]*))
A traditional NMEA string stops at the checksum, so I should be able to handle sentences (one line of text) that either come with or withou the metadata. I created a file called: pygments/lexers/nmea.py
import re
except NameError:
    from sets import Set as set

from pygments.lexer import RegexLexer
from pygments.token import Text, Name, String, Literal

__all__ = ['NmeaLexer']

class NmeaLexer(RegexLexer):
    '''Data streams from GPS, AIS, and other sensors.
    Does not support the NMEA 4.0 TAG Block extensions

    name = 'Nmea'
    aliases = ['nmea', 'NMEA', 'GPS', 'AIS', 'ais']
    filenames = ['*.ais',] # And?

    flags = re.VERBOSE

    tokens = {
        'root': [
            (r'[!$][^,]*', String), #,'fields'), # Talker and message type
            (r',[^,*\n]*', Text), # Field
            (r'[*][0-9A-F][0-9A-F]', Name, 'uscg-metadata'), # Checksum and move into USCG metadata
        'uscg-metadata': [
            (r',[^drstTx0-9][^,]*',Name.Entity), # Unknown things
            (r',d[^,]*',Name.Builtin), # signal strength
            (r',r[^,]*',Name.Attribute), # Station id/name
            (r',s[^,]*',Name.Class), # RSSI
            (r',t[^,]*',Name.Constant), # unknown 
            (r',T[^,]*',Name.Decorator), # time of arrival
            (r',x[^,]*',Name.Function), # station counter
            (r'$',Name.Tag,'#pop'), # Make sure we pop.  Doesn't see to help
Since NMEA is line oriented, I had to add a flags member to the class to turn off the default re.M (multiline) behavior. Then, to enable the lexer, I edited _mapping.py and added this line to the LEXERS dictionary:
    'NmeaLexer': ('pygments.lexers.nmea', 'Nmea', ('nmea',), ('*.ais',), ()),
Now it is ready to try:

You can see that this test file with partial metadata on the 2nd to last line causes something to go wrong with the last line. Hopefully, the community can improve upon this start.

And yes, I used kodos to help me debug my typos in the regex's.

Posted by Kurt | Permalink

08.01.2009 21:25

MarineBuzz on Boston right whale listening network

Intelligent Buoys to Avoid Whale Ship Collisions [MarineBuzz]

The article links to a WHOI page that I hadn't seen before (I don't deal much with the acoustics):

Buoys Help Avert Whale-Ship Collisions Specially engineered mooring system detects whales and warns ships

See also: Maritime Monday 173 [gCaptain.com]

Posted by Kurt | Permalink

08.01.2009 20:37

message decimation reveals strange AIS transceiver behavior

I should know enough by now to not be surprised by things like this, but I am. I was running checks on my decimation algorithm. Using just one receiver and only position messages, I should see very few removed messages (or none). I got way more than I expected: 108 messages (0.12%). While looking at the messages, I saw very unexpected behavior. Most of these messages are from one vessel and are retransmissions of the same position message on the same AIS channel, but 0.03 to 0.19 seconds apart.
% grep ',120756572' 20080407.log.norm | grep 8LvSh5000Jr
I checked another receiver and it sees the same results, so it's not a receiver having hardware issues.

Just from looking at the encoded text, I can see that it is the same vessel (the MMSI is not changing) putting out both ITDMA (msg 3) and SOTDMA (msg 1). e.g.
% ais_msg_1.py -d '!AIVDM,1,1,,B,18LvSh5000JrdUnH@D=;k8k<0d0B,0*27,s24326,d-112,T50.42236808,x16523,r003669947,1207565728'
        MessageID:          1
        RepeatIndicator:    0
        UserID:             567256000
        NavigationStatus:   5
        ROT:                0
        SOG:                0
        PositionAccuracy:   0
        longitude:          -71.06358167
        latitude:           42.388566667
        COG:                302
        TrueHeading:        281
        TimeStamp:          38
        RegionalReserved:   0
        Spare:              0
        RAIM:               False
        state_syncstate:    1
        state_slottimeout:  3
        state_slotoffset:   18
This is a ship sitting at a dock on the north side of Boston to the west of the Tobin Bridge. Its shipdata messages (msg 5) are consistant:
% sqlite ais.db3
SELECT imonumber,callsign,name,shipandcargo,dima+dimb,dimc+dimd,fixtype,destination FROM shipdata WHERE userid=567256000 GROUP BY imonumber,callsign,name,shipandcargo LIMIT 10;
IMOnumber callsign  name          shipandcargo dima+dimb  dimc+dimd  fixtype destination
8312174   HSBE      THOR ALLIANCE 70           189 m      31 m       1       BOSTON,USA
If anyone is able to find out what type of AIS unit the Thor Alliance, I would really appreciate hearing from you. I am very curious to know why this ship's AIS is violating the ITU 1371-3 specification for transmissions. Or is it's working fine, then there is something in the data collection hardware or software.

Posted by Kurt | Permalink

08.01.2009 15:16

AIS duplicate removal

This post is about post-processing AIS logs.

I've been thinking through the process of removing duplicate receive messages for the situate where I've got multiple receivers with partially overlapping coverage and vessels are transiting across the area. My first thought was to use time slices. If I know when the messages went out, then I should be able to get just look at time window and remove other packets from that vessel at that time. However, if timestamping isn't completely accurate, this method breaks down. Turns out that there should be a simpler method to decimating packets. If I totally ignore the metadata and just use the NMEA data payload text fields as a hash, I should be able to see if I have already passed through a packet with that hash and drop it if it's already come through.

I see two risks with this method. First, the table of hashes has to be bound. If I need to keep 5 million hashes around and search this set for each new packet, the decimation process is going to get very slow. It should be possible to calculate a reasonable FIFO length of hashes based on the number of stations reporting AIS data. When a new packet comes in and the queue gets longer than the maximum number of hashes, we simply drop the oldest hash. If necessary, this can be augmented to be a FIFO for each message number to speed up the searching.

The second problem is a little more trouble. What if we get two messages with the same hash that are really are at different times? This can happen with vessels that are working in a constrained area, e.g. operating in dynamic positioning (DP) mode or tied to a dock. With the LNG construction off of Boston, is is possible to get repeats of the position. However, it takes more than a repeat in the longitude and latitude. The ROT, SOG, heading, and commstate all have to be these same. These fields are likely changing, although there should be lots of repeats in commstate for SOTDMA. ITDMA position messages will not have as much variation in the commstate as SOTDMA, but these packets are less common. One strategy would be to track the receive station with the hash and emit the new message if it is a repeat of that heard from the same station. The hash strategy should work for most packets, but may have undesired results for tracking repeating notices where the content is the same (e.g. right whale notices where the buoy status has not changed). If I felt that I could trust time, some of these issues would go away.
o = file('unique.ais','w')

dropped = 0
for line_num, line in enumerate(file('20080407.log.norm')):
    if line_num % 10000 == 0:
        print 'line_num:',line_num, '  dropped:',dropped
    if '!AIVD' != line[:5]:
    fields = line.split(',')
    payload = fields[5]
    if payload in payloads:
        dropped += 1
    if len(payloads) > lookback_dist:
Running this on 262000 messages from 3 receivers in the Boston approaches dropped 89000 messages (34%).

Here is OpenDiff showing the change from the original (on the left) to the version without the duplicates on the right.

Posted by Kurt | Permalink

08.01.2009 13:52

EPA has a research vessel

I didn't know until yesterday that the EPA has an ocean going research vessel: Ocean Survey Vessel (OSV) Bold - I'd call is either S/V or R/V, but that's just me.
The Ocean Survey Vessel Bold (The Bold) is EPA's only ocean and
coastal monitoring vessel. The Bold is equipped with state-of-the-art
sampling, mapping, and analysis equipment including sidescan sonar,
underwater video, water sampling instruments, and sediment sampling
devices, which scientists can use in various monitoring
activities. The vessel is a converted U.S. Navy T-AGOS class vessel
and is 224 feet long and 43 feet wide. EPA acquired the Bold on March
31, 2004.
How did I find out? Ahoy Matey! [The White House Blog]

Posted by Kurt | Permalink

08.01.2009 13:43

ocean energy at UNH

Shaheen: Funds to aid UNH ocean lab [seacoastonline]
U.S. Sen. Jeanne Shaheen, D-N.H., told staff following a tour of the
center that the Senate has approved $750,000 for their work as part of
the Energy and Water Appropriations bill. Although the bill still has
to go to a conference committee, she said she's confident the funding
will pass through unscathed.

Posted by Kurt | Permalink

08.01.2009 13:08


OpenSeaMap looks to do the same as OpenStreetMap but for nautical charts. This is less important in the United States where NOAA releases charts for free, but even in the US, this could have interesting uses.

Here is a screen shot. This English is a bit strange because the original is German and automatic translaters are only so good.

Link found via Friday Geonews Cleanup: Proprietary vs Open Source Geospatial, 3D GIS, OpenSeaMap and Much More [slashgeo]

Posted by Kurt | Permalink