crosspad communication protocol
$Id: PROTOCOL,v 1.11 1998/11/11 06:42:04 itojun Exp $

Protocol
========
crosspad>pc: 02 xx xx xx xx cc
    02: sync byte (we may be able to deduce the communication baud rate
        from this byte)
    xx xx xx xx: data length
    cc: checksum, computed by xor'ing all "xx" bytes (correct?)
pc>crosspad: 06
    06: acknowledgement
crosspad>pc: 01 ss SS xx xx .... yy
    01: sync byte
    ss: sector# (starts from 00)
    SS: ff ^ sector# (starts from ff, and descends)
    xx xx: data length of the sector, at most 0x1000 (4096)
    ....: (data length) bytes
    yy: extra byte, meaning unknown (checksum?)
    NOTE: what happens if sector# exceeds 0xff?
pc>crosspad: 06
    06: acknowledgement
crosspad>pc: 07
    07: no more to send

File format
===========
NOTE: this is not the *.nbk file format.  This section describes the raw
data sent from the crosspad device.

** Date format
MM DD YY HH MM SS: all digits are in BCD.
For example, "Fri Sep 11 21:38:52 JST 1998" would be 09 11 98 21 38 52.

** Format outline for "version 0" format
There are several versions of the ink data encoding.  CrossPad emits the
"version 0" format.
A data file contains one or more segments.  A segment is formatted as
follows:
    00 00 00 00 xx MM DD YY HH MM SS yy ...
    00 00 00 00: version identifier.  This denotes the "version 0" format.
    xx: segment code
    MM DD YY HH MM SS: the date the segment was generated.
    yy ...: data, length and format determined by the segment code

NOTE: The data is like a transaction log.  The order of the data stream
is defined by date/time.  Therefore, page# may go back and forth if the
user flips pages back and forth and writes on various pages.

** Segments
01: huffman encoded stroke data (data len=variable)
    00 00 00 00 01 MM DD YY HH MM SS ll xx xx yy yy zz .....
        ff ff pp pp
    ll: data length from "ll" to "pp pp"
    xx xx yy yy: x/y coordinate of starting point
    zz ...: huffman encoded stroke data
    ff ff: terminator of huffman encoded stroke data
        (it is actually 0xff 0xff)
    pp pp: # of points described by huffman encoded stroke data,
        including the starting point
02: huffman encoded stroke data (data len=variable)
    Mostly the same as code=01, but code=02 indicates that the pen left
    the crosspad once and reached it again (i.e. code=02 seems to mean
    that the radio receiver in the pad lost the transmitter in the pen).
03: keyword selection (data len=0x05?)
    00 00 00 00 03 MM DD YY HH MM SS 04 xx xx xx xx
    04: maybe length?
    xx xx xx xx: unknown, usually 00 xx 00 xx
04: page # (data len=0x04)
    00 00 00 00 04 MM DD YY HH MM SS xx xx xx xx
    xx xx xx xx: page # equals this value from here on
    NOTE: The data from the pad is like a transaction log.  The order of
    the data stream is defined by date/time.  Therefore, page# may go
    back and forth if the user flips pages back and forth.
05: clock adjust (data len=0x06)
    00 00 00 00 05 MM DD YY HH MM SS MM DD YY HH MM SS
    MM DD YY HH MM SS: the date/time the clock was updated
        (usually same as the first set of date/time data)
06: various attributes (data len=0x0e)
    00 00 00 00 06 MM DD YY HH MM SS 00 00 00 01 02 04 00 d8 01 18 01 00 fe xx
    00 00 00 01 02 04 00 d8 01 18 01 00 fe: unknown
    xx: huffman encoding type (described later)
    Also, the coordinate system needs to be shifted to some extent.
    NOTE: appears in original CrossPad only.
0a: filename? (data len=0x08)
    00 00 00 00 0a MM DD YY HH MM SS xx xx xx xx xx xx xx xx
    xx ...: ASCII string, "File0001" by default
0d: keyword/title? (data len=0x08)
    00 00 00 00 0d MM DD YY HH MM SS xx xx xx xx xx xx xx xx
    xx ...: ASCII string, "unknown\0" by default
0e: unknown (data len=0x01)
    00 00 00 00 0e MM DD YY HH MM SS xx
    xx: unknown
        almost always 03 on data from CrossPad
        almost always 01 on data generated by SDK
1d: various attributes (data len=0x26)
    00 00 00 00 1d MM DD YY HH MM SS xx ....
    xx ...: unknown
    huffman encoding type (described later) is always "y negated" style.
    NOTE: appears in CrossPad XP only.
35: ??? (data len=0x00)
    00 00 00 00 35 MM DD YY HH MM SS
    NOTE: appears in CrossPad XP only.
36: bookmark (data len=0x00)
    00 00 00 00 36 MM DD YY HH MM SS
    MM DD YY HH MM SS: the date/time the bookmark was placed on this page
39: ??? (data len=0x21)
    00 00 00 00 39 MM DD YY HH MM SS xx ...
    xx ...: unknown
    appears only in SDK-generated data
3a: last download date (data len=0x00)
    00 00 00 00 3a MM DD YY HH MM SS
    MM DD YY HH MM SS: the last date/time a download ("Upload Ink" in
        the pad menu) was performed
3c: ??? (data len=0x0c)
    00 00 00 00 3c MM DD YY HH MM SS xx ...
    xx ...: unknown
    appears only in SDK-generated data
3d: ??? (data len=0x0a)
    00 00 00 00 3d MM DD YY HH MM SS xx ...
    xx ...: unknown
    appears only in SDK-generated data
3e: ??? (data len=0x04)
    00 00 00 00 3e MM DD YY HH MM SS xx ...
    xx ...: unknown
    appears only in SDK-generated data

** Encoding/decoding stroke data
Coordinate system
    The coordinate system is a normal xy plane, Quadrant 3.
    (x increases toward the right; y increases toward the bottom)

Huffman table
    The following table is the huffman encoding/decoding table used in
    "version 0" format files.

    bit string          value
    ---                 ---
    110101011111001     -16
    110101011110        -15
    1100100100          -14
    110010011           -13
    110101010           -12
    11010100            -11
    1000100             -10
    1010011             -9
    1101011             -8
    101000              -7
    110011              -6
    10000               -5
    11000               -4
    11011               -3
    1011                -2
    010                 -1
    00                  0
    011                 1
    1001                2
    10101               3
    110100              4
    100011              5
    1100101             6
    1010010             7
    11001000            8
    10001010            9
    100010111           10
    100010110           11
    1100100101          12
    11010101110         13
    11010101100         14
    11010101101         15
    1101010111111       16
    11010101111101      18
    110101011111000xxxxxxxx     means "xxxxxxxx" in signed byte
    11010101111100000000000     means 0
    11111111            termination
    NOTE: missing values are to be filled.

    If "huffman encoding type" in segment 06 is 01, the huffman table is
    used for both the x axis and the y axis.
    If "huffman encoding type" in segment 06 is 02, the huffman table is
    used for the x axis.  For the y axis, negate the value (i.e. bit
    string "011" means -1, not 1).

Encoding stroke data
    Assume the following stroke:
        start from (10, 10), go through (12, 12), (12, 14), (12, 16)
    In this case, the starting point for segment 01 (or 02) will be
    (10, 10).  We have 4 points, including the starting point.
    Movement is encoded as the differences between successive
    coordinates, x first and then y for each point.  Therefore, we need
    to encode the following set of numbers:
        2 2 0 2 0 2
    Convert this into a bit string by using the huffman table described
    above (here let us assume that the huffman encoding type is 01):
        1001 1001 00 1001 00 1001
    By converting this to hexadecimal values, we get:
        99 24 90
    The resulting huffman encoded data (segment 01) will be:
        00 00 00 00 01 MM DD YY HH MM SS 0c 00 0a 00 0a 99 24 90 ff ff 00 04
        0c: length of data portion
        00 0a 00 0a: starting point is (10, 10)
        99 24 90: huffman encoded data
        ff ff: termination
        00 04: we have 4 points, including the starting point
    NOTE: the padding rule for the huffman encoded data portion is
    unknown.  Zero-fill should be okay.

Decoding stroke data
    Let us try decoding segment 01 with the following data bytes:
        00 00 00 00 01 MM DD YY HH MM SS 0c 00 0a 00 0a 99 24 90 ff ff 00 04
    From the length field (0c), the data portion of the segment is:
        00 0a 00 0a 99 24 90 ff ff 00 04
    The starting point is (10, 10) since we have "00 0a 00 0a" for the
    coordinate.  We have 4 points, including the starting point.
    Movement of the pen is described by the following hexadecimal values:
        99 24 90
    Writing this in binary, we get:
        1001 1001 0010 0100 1001 0000
    By performing longest match against the huffman table, we get:
        1001 1001 00 1001 00 1001 00 00
    Convert this into movement by using the huffman table:
        2 2 0 2 0 2 0 0
    Since the segment holds 4 points including the starting point, only
    the first three x/y pairs are meaningful; the trailing "0 0"
    presumably comes from the zero padding.
    As a result, we can understand that segment 01 means a stroke like:
        start from (10, 10), go through (12, 12), (12, 14), (12, 16)
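
Decoding sketch (Python)
    The decoding walk-through above can be sketched as a small program.
    This is only an illustration, not a reference implementation.  The
    names CODES, decode_deltas and decode_stroke are made up here.  The
    sketch assumes a one-byte length field, unsigned big-endian
    coordinates and point count (as the worked example suggests), and
    x/y differences interleaved as x first, then y.  The table is
    transcribed as listed above, including the entry shown as 18.

```python
# Huffman table transcribed from the "Huffman table" section (version 0).
CODES = {
    "110101011111001": -16, "110101011110": -15, "1100100100": -14,
    "110010011": -13, "110101010": -12, "11010100": -11, "1000100": -10,
    "1010011": -9, "1101011": -8, "101000": -7, "110011": -6,
    "10000": -5, "11000": -4, "11011": -3, "1011": -2, "010": -1,
    "00": 0, "011": 1, "1001": 2, "10101": 3, "110100": 4, "100011": 5,
    "1100101": 6, "1010010": 7, "11001000": 8, "10001010": 9,
    "100010111": 10, "100010110": 11, "1100100101": 12,
    "11010101110": 13, "11010101100": 14, "11010101101": 15,
    "1101010111111": 16,
    "11010101111101": 18,  # listed as 18 in the table
}
ESCAPE = "110101011111000"  # followed by 8 bits holding a signed byte
MAX_LEN = max(len(code) for code in CODES)


def decode_deltas(data, count):
    """Decode `count` values from the huffman encoded bytes `data`."""
    bits = "".join(f"{byte:08b}" for byte in data)
    values, pos = [], 0
    while len(values) < count:
        if bits.startswith(ESCAPE, pos):
            raw = bits[pos + len(ESCAPE):pos + len(ESCAPE) + 8]
            if len(raw) < 8:
                raise ValueError("truncated escape sequence")
            value = int(raw, 2)
            values.append(value - 256 if value >= 128 else value)
            pos += len(ESCAPE) + 8
            continue
        # The table is a prefix code, so the first match is the only one.
        for length in range(2, MAX_LEN + 1):
            value = CODES.get(bits[pos:pos + length])
            if value is not None:
                values.append(value)
                pos += length
                break
        else:
            raise ValueError(f"no huffman code matches at bit {pos}")
    return values


def decode_stroke(payload, encoding_type=1):
    """Decode a segment 01/02 payload, starting at the "ll" byte.

    encoding_type is the "huffman encoding type" from segment 06:
    1 = table used for both axes, 2 = y values negated.
    Returns the stroke as a list of (x, y) points.
    """
    body = payload[:payload[0]]  # "ll" counts from "ll" to "pp pp"
    if len(body) < 9 or body[-4:-2] != b"\xff\xff":
        raise ValueError("missing ff ff terminator")
    x = int.from_bytes(body[1:3], "big")  # assumed unsigned big-endian
    y = int.from_bytes(body[3:5], "big")
    npoints = int.from_bytes(body[-2:], "big")
    # Only 2 * (npoints - 1) values are read; padding bits are ignored.
    deltas = decode_deltas(body[5:-4], 2 * (npoints - 1))
    points = [(x, y)]
    for dx, dy in zip(deltas[0::2], deltas[1::2]):
        if encoding_type == 2:
            dy = -dy
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Payload from the worked example, without the 11-byte segment header.
    print(decode_stroke(bytes.fromhex("0c000a000a992490ffff0004")))
```

    Reading exactly 2 * (points - 1) values, instead of decoding until
    the data runs out, is what keeps the padding bits from turning into
    extra points.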