Tidying TLEs in R

I’ve been working with the Union of Concerned Scientists’ data on active satellites for a while now, and decided it was time to add Space-Track’s debris data to it. The UCS data is nice to work with because it’s already tidy: one row per observation, one column per variable. One format for debris data is Two-Line Elements (TLEs) (Space-Track description).

TLEs are apparently great for orbital propagation and stuff, I don’t know, I’m not an aerospace engineer. I’ve seen some stuff about working with TLEs for propagators in MATLAB or Python, but nothing about (a) working with them in R or (b) tidying a collection of TLEs in any program. This post documents a solution I hacked together, mixing an ugly bit of base R with a neat bit of tidyverse. The tidyverse code was adapted from a post by Matthew Lincoln about tidying a crazy single-column table with readr, dplyr, and tidyr. Since I’m reading the data in with read.csv() from base, I’m not using readr.

This process also isn’t necessary. Space-Track makes JSON-formatted data available, and read_json() from jsonlite handles it nicely. This is a “doing things for the sake of it” kind of post.

So. Supposing you’ve gotten your TLEs downloaded into a nice text file, the first step is to read it into R. The TLEs I’m interested in are for currently-tracked objects in LEO, which comes out to a file with 8067 rows and 1 column (API query, at least as of when I wrote this).

library(dplyr)
library(tidyr)
library(stringr) # provides str_trim(), used in the tidying step

# read in TLEs
options(stringsAsFactors=FALSE) # it's more convenient to work with character strings rather than factors
leo_3le <- read.csv("leo_3le.txt",header=FALSE) # first row is not a column name

This is entirely a statement about me and not the format: TLEs look weird as hell.

> dim(leo_3le)
[1] 8067    1

> head(leo_3le)
                                                                     V1
1                                                          0 VANGUARD 2
2 1 00011U 59001A   18194.13149990  .00000063  00000-0  13264-4 0  9995
3 2 00011  32.8728 183.1765 1468204 230.1854 116.0167 11.85536077532985
4                                                          0 VANGUARD 3
5 1    20U 59007A   18193.39885059 +.00000024 +00000-0 +12986-4 0  9998
6 2    20 033.3397 177.1390 1667296 121.6835 255.7164 11.55589562148902

Lines 1:3 represent a single object. Lines 4:6 represent another object. It’s an annoying format for the things I want to do.
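Everything downstream leans on that strict three-row rhythm, so a quick sanity check before reshaping seems worthwhile. This is only a minimal sketch and not part of my original workflow: it rebuilds the six rows printed above as a stand-in for leo_3le, and check_3le() is a made-up helper name, not a function from any package.

```r
# Stand-in for leo_3le: the six rows printed by head() above
leo_3le <- data.frame(V1 = c(
  "0 VANGUARD 2",
  "1 00011U 59001A   18194.13149990  .00000063  00000-0  13264-4 0  9995",
  "2 00011  32.8728 183.1765 1468204 230.1854 116.0167 11.85536077532985",
  "0 VANGUARD 3",
  "1    20U 59007A   18193.39885059 +.00000024 +00000-0 +12986-4 0  9998",
  "2    20 033.3397 177.1390 1667296 121.6835 255.7164 11.55589562148902"
), stringsAsFactors = FALSE)

# check_3le() is a hypothetical helper: TRUE if the rows come in complete
# groups of three whose first characters cycle 0, 1, 2
check_3le <- function(df) {
  n <- nrow(df)
  first_char <- substr(df[, 1], 1, 1)
  expected <- rep(c("0", "1", "2"), length.out = n)
  n %% 3 == 0 && all(first_char == expected)
}

check_3le(leo_3le)                      # TRUE for the sample above
check_3le(leo_3le[-1, , drop = FALSE])  # FALSE: a dropped row breaks the pattern
```

If this comes back FALSE on the real file, the row-selection trick below would quietly pair names with the wrong parameter lines, so it's cheap insurance.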

Ok, now the ugly hack stuff: I’m going to select every third row using vectors with ones in appropriate spots, relabel zeros to NAs, drop the NAs, then recombine them all into a single dataframe.

# the ugly hack: rearranging the rows with pseudo matrix math.
# first, I select the indices for pieces that are line 0 (names), line 1 (a set of parameters), and line 2 (another set of parameters)
rownums <- as.numeric(row.names(leo_3le)) # row.names() returns character strings, so convert them to numbers before multiplying
tle_names_idx <- rownums*rep(c(1,0,0),length.out=dim(leo_3le)[1])
tle_1line_idx <- rownums*rep(c(0,1,0),length.out=dim(leo_3le)[1])
tle_2line_idx <- rownums*rep(c(0,0,1),length.out=dim(leo_3le)[1])
# rename the zeroed observations to NA so they're easy to drop
tle_names_idx[tle_names_idx==0] <- NA
tle_1line_idx[tle_1line_idx==0] <- NA
tle_2line_idx[tle_2line_idx==0] <- NA
# now drop the NAs
tle_names_idx <- tle_names_idx[!is.na(tle_names_idx)]
tle_1line_idx <- tle_1line_idx[!is.na(tle_1line_idx)]
tle_2line_idx <- tle_2line_idx[!is.na(tle_2line_idx)]
# recombine everything into a dataframe
leo_3le_dfrm <- data.frame(sat.name = leo_3le[tle_names_idx,1],
                           line1 = leo_3le[tle_1line_idx,1],
                           line2 = leo_3le[tle_2line_idx,1])

This leaves me with a 2689-row 3-column dataframe. The first column has the satellite name (line 0 of the TLE), the second column has the first set of parameters (line 1 of the TLE), and the third column has the second set of parameters (line 2 of the TLE). There’s probably a way to do this in tidyverse.

> dim(leo_3le_dfrm)
[1] 2689    3

> head(leo_3le_dfrm)
           sat.name
1      0 VANGUARD 2
2      0 VANGUARD 3
3      0 EXPLORER 7
4         0 TIROS 1
5      0 TRANSIT 2A
6 0 SOLRAD 1 (GREB)
                                                                  line1
1 1 00011U 59001A   18194.13149990  .00000063  00000-0  13264-4 0  9995
2 1    20U 59007A   18193.39885059 +.00000024 +00000-0 +12986-4 0  9998
3 1 00022U 59009A   18193.85420323  .00000017  00000-0  25009-4 0  9994
4 1 00029U 60002B   18194.55070431 -.00000135  00000-0  10965-4 0  9992
5 1 00045U 60007A   18194.14978757 -.00000039  00000-0  17108-4 0  9997
6 1 00046U 60007B   18194.29883214 -.00000041  00000-0  15170-4 0  9997
                                                                  line2
1 2 00011  32.8728 183.1765 1468204 230.1854 116.0167 11.85536077532985
2 2    20 033.3397 177.1390 1667296 121.6835 255.7164 11.55589562148902
3 2 00022  50.2826 171.3798 0140753  83.9529 277.7437 14.94580679119776
4 2 00029  48.3805  35.0120 0023682  97.0974 263.2631 14.74254161114643
5 2 00045  66.6952  79.4545 0248473 109.6184 253.1917 14.33604371 21605
6 2 00046  66.6897 151.6848 0217505 295.8620  62.0202 14.49157393 36927
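On the "probably a way to do this in tidyverse" point: one option is to tag each row with an object id and a part label, then spread the parts into columns with pivot_wider(). This is a sketch under assumptions, not what I actually ran: it needs tidyr 1.0.0 or later (where pivot_wider() was introduced), it runs on six sample rows instead of the full file, and the helper column names object and part are my own.

```r
library(dplyr)
library(tidyr)

# Six sample rows in the same one-column layout that read.csv() produces
leo_3le <- data.frame(V1 = c(
  "0 VANGUARD 2",
  "1 00011U 59001A   18194.13149990  .00000063  00000-0  13264-4 0  9995",
  "2 00011  32.8728 183.1765 1468204 230.1854 116.0167 11.85536077532985",
  "0 VANGUARD 3",
  "1    20U 59007A   18193.39885059 +.00000024 +00000-0 +12986-4 0  9998",
  "2    20 033.3397 177.1390 1667296 121.6835 255.7164 11.55589562148902"
), stringsAsFactors = FALSE)

wide <- leo_3le %>%
  mutate(
    # 0,0,0,1,1,1,...: one id per three-row object
    object = (row_number() - 1) %/% 3,
    # which of the three TLE lines this row is
    part = rep(c("sat.name", "line1", "line2"), length.out = n())
  ) %>%
  # one row per object, one column per part
  pivot_wider(names_from = part, values_from = V1) %>%
  select(-object)

dim(wide)  # 2 rows, 3 columns for this sample
```

In base R, matrix(leo_3le$V1, ncol = 3, byrow = TRUE) gets to roughly the same place in one line, minus the column names.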

The tidyverse functions are the prettiest part of this. I create new objects to hold the modified vectors (just a personal tic), then run two pipes to do the cleaning, one for each TLE line. The first step in each pipe splits the strings at fixed character positions. Why not just split on whitespace, you ask? Apparently there can be whitespace inside some of the orbital elements ¯\_(ツ)_/¯ (Element Set Epoch, columns 19-32 of line 1), and some neighboring fields have no whitespace between them at all (in line 2 of VANGUARD 2 above, the mean motion, revolution number, and checksum run straight into each other). The second step trims leading and trailing whitespace. Finally, I recombine everything into a dataframe. There are tidy ways to do this, but I like using base R for this.

# make separate objects for the first and second line elements
line1_col <- tibble(text = leo_3le_dfrm[,2]) # tibble() replaces the deprecated data_frame()
line2_col <- tibble(text = leo_3le_dfrm[,3])

# the beautiful tidying: split the dataframe where there are variables and trim whitespace
leo_3le_dfrm_line1 <- line1_col %>%
  # split the strings at fixed character positions
  separate(text,
           into=c("line.num1","catalog.number","elset.class","intl.des","epoch",
                  "mean.motion.deriv.1","mean.motion.deriv.2","b.drag",
                  "elset.type","elset.num","checksum"),
           sep=c(1,7,8,17,32,43,52,61,63,68)) %>%
  # trim whitespace; str_trim() comes from stringr, so that package needs to be loaded
  mutate_at(.funs=str_trim, .vars=vars(line.num1:checksum))
leo_3le_dfrm_line2 <- line2_col %>%
  # split the strings at fixed character positions
  separate(text,
           into=c("line.num2","catalog.number.2","inclination","raan.deg",
                  "eccentricity","aop","mean.anomaly.deg","mean.motion",
                  "rev.num.epoch","checksum"),
           sep=c(1,7,16,25,33,42,51,63,68)) %>%
  # trim whitespace
  mutate_at(.funs=str_trim, .vars=vars(line.num2:checksum))
leo_3le_dfrm <- as.data.frame(cbind(sat.name=leo_3le_dfrm$sat.name,
                                    leo_3le_dfrm_line1,
                                    leo_3le_dfrm_line2))
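To see where the numbers passed to sep come from, it can help to slice a single line by hand. The sketch below uses only base R (substring() and trimws()) on the first line 1 from the sample. It reuses the same cut points as the separate() call, where each number is the last character position of a field. It's an illustration of the idea, not a replacement for the pipe above.

```r
# Line 1 of VANGUARD 2, copied from the sample output (69 characters)
line1 <- "1 00011U 59001A   18194.13149990  .00000063  00000-0  13264-4 0  9995"

# Same cut points as the separate() call: each is the last character of a field
cuts   <- c(1, 7, 8, 17, 32, 43, 52, 61, 63, 68)
starts <- c(1, cuts + 1)          # every field starts right after the previous cut
ends   <- c(cuts, nchar(line1))   # the final field runs to the end of the line

# substring() is vectorised over start and end, so this pulls all 11 fields at once
fields <- trimws(substring(line1, starts, ends))
names(fields) <- c("line.num1", "catalog.number", "elset.class", "intl.des",
                   "epoch", "mean.motion.deriv.1", "mean.motion.deriv.2",
                   "b.drag", "elset.type", "elset.num", "checksum")

fields[["epoch"]]      # "18194.13149990"
fields[["elset.num"]]  # "999"
fields[["checksum"]]   # "5"
```

Ten cut points give eleven fields, which is why the into vector for line 1 has one more entry than sep.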

The end result is a 2689-row 22-column tidy dataframe of orbital parameters which can be merged with other tidy datasets and used for all kinds of other analysis:

> dim(leo_3le_dfrm)
[1] 2689   22

> head(leo_3le_dfrm)
           sat.name line.num1 catalog.number elset.class intl.des
1      0 VANGUARD 2         1          00011           U   59001A
2      0 VANGUARD 3         1             20           U   59007A
3      0 EXPLORER 7         1          00022           U   59009A
4         0 TIROS 1         1          00029           U   60002B
5      0 TRANSIT 2A         1          00045           U   60007A
6 0 SOLRAD 1 (GREB)         1          00046           U   60007B
           epoch mean.motion.deriv.1 mean.motion.deriv.2   b.drag elset.type
1 18194.13149990           .00000063             00000-0  13264-4          0
2 18193.39885059          +.00000024            +00000-0 +12986-4          0
3 18193.85420323           .00000017             00000-0  25009-4          0
4 18194.55070431          -.00000135             00000-0  10965-4          0
5 18194.14978757          -.00000039             00000-0  17108-4          0
6 18194.29883214          -.00000041             00000-0  15170-4          0
  elset.num checksum line.num2 catalog.number.2 inclination raan.deg
1       999        5         2            00011     32.8728 183.1765
2       999        8         2               20    033.3397 177.1390
3       999        4         2            00022     50.2826 171.3798
4       999        2         2            00029     48.3805  35.0120
5       999        7         2            00045     66.6952  79.4545
6       999        7         2            00046     66.6897 151.6848
  eccentricity      aop mean.anomaly.deg mean.motion rev.num.epoch checksum
1      1468204 230.1854         116.0167 11.85536077         53298        5
2      1667296 121.6835         255.7164 11.55589562         14890        2
3      0140753  83.9529         277.7437 14.94580679         11977        6
4      0023682  97.0974         263.2631 14.74254161         11464        3
5      0248473 109.6184         253.1917 14.33604371          2160        5
6      0217505 295.8620          62.0202 14.49157393          3692        7

It still needs to be cleaned a bit - those + signs in the mean motion derivatives are annoying, I don’t need the line number or checksum columns, and I want to get rid of the leading 0 and whitespace in sat.name - but this is good enough for now.
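For what it's worth, here is a rough sketch of what that remaining cleanup could look like in base R. It's untested against the full dataset and makes a few assumptions: the strings are already trimmed, b.drag always has a five-digit mantissa followed by a signed one-digit exponent, and tle_sci() is a hypothetical helper name I made up. The toy dataframe only carries a handful of the 22 columns.

```r
# Two rows shaped like the final dataframe, keeping only a few columns
tle <- data.frame(
  sat.name            = c("0 VANGUARD 2", "0 VANGUARD 3"),
  line.num1           = c("1", "1"),
  mean.motion.deriv.1 = c(".00000063", "+.00000024"),
  b.drag              = c("13264-4", "+12986-4"),
  eccentricity        = c("1468204", "1667296"),
  checksum            = c("5", "8"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TLE fields like b.drag have an implied leading decimal
# point and a trailing signed power of ten, so "13264-4" means 0.13264e-4
tle_sci <- function(x) {
  sign     <- ifelse(startsWith(x, "-"), -1, 1)
  digits   <- sub("^[+-]", "", x)                        # drop any leading sign
  mantissa <- as.numeric(paste0("0.", substr(digits, 1, 5)))
  exponent <- as.numeric(substr(digits, 6, 7))           # e.g. "-4"
  sign * mantissa * 10^exponent
}

tle$sat.name            <- sub("^0 +", "", tle$sat.name)             # drop the leading "0 "
tle$mean.motion.deriv.1 <- as.numeric(tle$mean.motion.deriv.1)       # the + sign parses fine
tle$b.drag              <- tle_sci(tle$b.drag)
tle$eccentricity        <- as.numeric(paste0("0.", tle$eccentricity)) # implied decimal point

# drop the line number and checksum columns by name pattern
tle <- tle[, !grepl("^(line\\.num|checksum)", names(tle))]
```

Matching on the name pattern also sidesteps the fact that the full dataframe ends up with two columns both called checksum.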
