Randomness Tests for Circular Data

Introduction

The GTRT package provides an efficient and user-friendly framework for conducting the randomness tests for linear and circular data proposed in Gehlot and Laha (2025a) and Gehlot and Laha (2025b), respectively. This vignette explains how the randomness tests for circular data work. You can load the package as follows.

library(GTRT)

In circular (directional) statistics, data is expressed as angles or directions, which can be represented as points on a unit circle by specifying an initial (zero) direction and an orientation. While the terms randomness and uniformity are often used interchangeably in circular statistics, they are not equivalent. As a result, a test of circular uniformity is not suitable for assessing randomness in circular data. Gehlot and Laha (2025b) present two novel randomness tests for circular data based on Random Circular Arc Graphs (RCAGs), demonstrating high accuracy and strong applicability in real-world scenarios.

These tests leverage two key properties of RCAGs: edge probability and vertex degree distribution. Using these properties, the authors develop two tests of randomness:

RCAG-Edge Probability (RCAG-EP)
RCAG-Degree Distribution (RCAG-DD)

This vignette describes the working of both tests in detail.
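The distinction between uniformity and randomness drawn above can be illustrated with base R alone. The sketch below is not taken from the package: it compares an i.i.d. uniform sample with the same values sorted in time order. The mean resultant length (the quantity behind the Rayleigh uniformity test) cannot tell the two apart because it ignores ordering, while a simple serial summary (the helper `lag1_cos`, a name introduced here for illustration) clearly can.

```r
set.seed(42)
n <- 500

# Same set of angles in two different orderings.
theta_iid    <- runif(n, 0, 2 * pi)  # i.i.d. draws: uniform and random
theta_sorted <- sort(theta_iid)      # same values, but strongly ordered in time

# Mean resultant length: depends only on the set of values, not on their order.
mean_resultant_length <- function(theta) {
  sqrt(mean(cos(theta))^2 + mean(sin(theta))^2)
}

# Illustrative serial-dependence summary: average cosine of successive differences.
# It is near 0 for independent uniform angles and near 1 when neighbours are close.
lag1_cos <- function(theta) mean(cos(diff(theta)))

r_iid     <- mean_resultant_length(theta_iid)
r_sorted  <- mean_resultant_length(theta_sorted)
c1_iid    <- lag1_cos(theta_iid)
c1_sorted <- lag1_cos(theta_sorted)

print(c(r_iid = r_iid, r_sorted = r_sorted))      # identical up to rounding
print(c(c1_iid = c1_iid, c1_sorted = c1_sorted))  # very different
```

Any order-invariant uniformity statistic behaves like `mean_resultant_length` here, which is why a separate test of randomness is needed.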

RCAG-EP Test

Let \(G\) be an RCAG formed using the given observations and \(p\) be the probability that an edge between two randomly chosen vertices in \(G\) does not exist. If the observations are mutually independent, then \(p=\frac{1}{6}\). Thus, we test \(H_0 : p=\frac{1}{6}\) against the alternative \(H_1: p \neq \frac{1}{6}\), to test for randomness. Let \(\hat{p}\) be the proportion of pairs whose vertices are not joined by an edge, i.e., the proportion of non-intersecting pairs (nip).
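One way to see where the value \(\frac{1}{6}\) comes from is the following counting argument. It is a sketch under two assumptions that are not stated in this vignette: each arc runs counterclockwise from its start point \(s_i\) to its end point \(t_i\), and the four endpoints of two arcs are i.i.d. draws from a continuous distribution. The construction actually used is defined in Gehlot and Laha (2025b).

```latex
% Four i.i.d. continuous points s_1, t_1, s_2, t_2 on the circle have
% (4-1)! = 6 equally likely cyclic orders. Reading counterclockwise from s_1:
\[
\begin{array}{ll}
(s_1, t_1, s_2, t_2) & \text{arcs are disjoint} \\
(s_1, t_1, t_2, s_2) & \text{arc 2 contains arc 1} \\
(s_1, s_2, t_1, t_2) & \text{arcs overlap} \\
(s_1, s_2, t_2, t_1) & \text{arc 1 contains arc 2} \\
(s_1, t_2, s_2, t_1) & \text{arcs overlap at both ends} \\
(s_1, t_2, t_1, s_2) & \text{arcs overlap}
\end{array}
\qquad\Longrightarrow\qquad
p = P(\text{no edge}) = \frac{1}{6}.
\]
```

Only the cyclic order of the endpoints matters in this argument, so under these assumptions the value does not depend on the common distribution of the observations.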

The nip.rcag() function calculates the value of \(\hat{p}\) for a given set of arcs and pairs of arcs. It takes the following parameters:

s - Start points of arcs (in radians or degrees)
t - End points of arcs (in radians or degrees)
e1 - Vector of indices for the first arc in each pair
e2 - Vector of indices for the second arc in each pair

s <- circular::rcircularuniform(10) # Starting points of 10 arcs
t <- circular::rcircularuniform(10) # End points of arcs
e1 <- c(2,10,6,1,5) # Indices of the first arc in each of 5 pairs formed from the 10 arcs above
e2 <- c(4,3,8,7,9) # Indices of the second arc in each of the same 5 pairs
nip.rcag(s,t,e1,e2)
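To make the quantity computed above concrete, here is a minimal base R sketch of a proportion of non-intersecting pairs. It is not the package's implementation: `arcs_disjoint()` and `nip_sketch()` are hypothetical helpers, and the sketch assumes that each arc runs counterclockwise from its start point to its end point, with angles in radians.

```r
# Hypothetical helper (not part of GTRT): TRUE where two arcs share no point.
# Assumption: arc i runs counterclockwise from its start to its end (radians).
arcs_disjoint <- function(s1, t1, s2, t2) {
  len1 <- (t1 - s1) %% (2 * pi)                  # counterclockwise length of arc 1
  len2 <- (t2 - s2) %% (2 * pi)                  # counterclockwise length of arc 2
  s2_on_arc1 <- ((s2 - s1) %% (2 * pi)) <= len1  # start of arc 2 lies on arc 1
  s1_on_arc2 <- ((s1 - s2) %% (2 * pi)) <= len2  # start of arc 1 lies on arc 2
  # Two arcs intersect exactly when one of them contains the other's start point.
  !(s2_on_arc1 | s1_on_arc2)
}

# Hypothetical stand-in for nip.rcag(): proportion of non-intersecting pairs.
nip_sketch <- function(s, t, e1, e2) {
  mean(arcs_disjoint(s[e1], t[e1], s[e2], t[e2]))
}

set.seed(1)
n_arcs <- 20000
s  <- runif(n_arcs, 0, 2 * pi)   # start points
t  <- runif(n_arcs, 0, 2 * pi)   # end points
e1 <- seq(1, n_arcs, by = 2)     # pair arc 1 with arc 2, arc 3 with arc 4, ...
e2 <- seq(2, n_arcs, by = 2)

p_hat <- nip_sketch(s, t, e1, e2)
print(p_hat)                     # should be close to 1/6 for independent endpoints
```

With independent endpoints the estimate should settle near \(\frac{1}{6}\), in line with the null hypothesis of the RCAG-EP test.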

The rcagep.test() function takes a vector \(\vec{\theta}\) of angular observations (in radians or degrees) and a level of significance \(\alpha\) as input and performs the RCAG-EP randomness test. It returns the estimated probability of non-intersection (\(\hat{p}\)), the cutoff \(C\) such that the null hypothesis of randomness is rejected at level \(\alpha\) when \(|\hat{p}-\frac{1}{6}| > C\), and adjusted p-values obtained using the Benjamini-Hochberg correction for multiple testing.

x <- arima.sim(model = list(ar=0.9), 1000) ## AR(1) model
theta <- ((2*atan(x))%%(2*pi)) ##LAR(1) model
rcagep.test(theta,0.05)

## 
## RCAG-EP Test
## 
## data: theta
## prob. of non-intersection: 0.184 
## Adj p-values: 0.4621 
## Cutoff: 0.0462
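The decision rule can be checked by hand from the printed output. The sketch below copies \(\hat{p}\) and the cutoff from the output above and, separately, shows how a Benjamini-Hochberg adjustment can be computed with base R's p.adjust(). The raw p-values are made up for illustration, and whether rcagep.test() calls p.adjust() internally is not stated in this vignette.

```r
# Values copied from the printed output above.
p_hat  <- 0.184
cutoff <- 0.0462

deviation <- abs(p_hat - 1/6)   # about 0.0173
reject    <- deviation > cutoff # FALSE: randomness is not rejected at this level
print(c(deviation = deviation, cutoff = cutoff))
print(reject)

# Benjamini-Hochberg adjustment on hypothetical raw p-values (illustration only).
raw_p <- c(0.01, 0.04, 0.03)
adj_p <- p.adjust(raw_p, method = "BH")
print(adj_p)                    # 0.03 0.04 0.04
```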

RCAG-DD Test

Let \(G\) be an RCAG formed using the given observations, \(\hat{F_n}\) be the empirical vertex degree distribution of this graph \(G\), and \(F^*\) be the theoretical vertex degree distribution of an RCAG with \(n\) vertices.

The cdf.rcag() function takes the number of observations (\(m\)) as its parameter and calculates the theoretical vertex degree distribution (\(F^*\)) of an RCAG with \(n\) vertices, where \(m=2n\) or \(m=2n+1\).

cdf.rcag(1000)

The deg.rcag() function calculates the degree of each vertex of the RCAG \(G\) obtained from the given observations.

x <- arima.sim(model = list(ar=0.9), 1000) ## AR(1) model
theta <- ((2*atan(x))%%(2*pi)) ##LAR(1) model
deg.rcag(theta)
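As a rough illustration of what a vertex degree means here, the following base R sketch builds a circular arc graph from a vector of angles and counts, for each arc, how many other arcs it intersects. The function `deg_sketch()` is hypothetical, and the construction is an assumption made for this example: consecutive observations are paired into arcs, each arc runs counterclockwise from its first to its second endpoint, and a leftover observation is dropped when the sample size is odd. The package's own construction follows Gehlot and Laha (2025b) and may differ.

```r
# Hypothetical sketch (not deg.rcag()): vertex degrees of a circular arc graph.
# Assumption: (theta[1], theta[2]), (theta[3], theta[4]), ... are the arcs,
# each running counterclockwise from its first to its second endpoint (radians).
deg_sketch <- function(theta) {
  n         <- length(theta) %/% 2
  arc_start <- theta[seq(1, 2 * n, by = 2)]
  arc_end   <- theta[seq(2, 2 * n, by = 2)]
  arc_len   <- (arc_end - arc_start) %% (2 * pi)

  # offset[i, j]: counterclockwise distance from the start of arc i to the start of arc j
  offset <- outer(arc_start, arc_start, function(si, sj) (sj - si) %% (2 * pi))
  # start_on[i, j] is TRUE when the start of arc j lies on arc i
  # (arc_len is recycled down the columns, so row i is compared with arc_len[i])
  start_on <- offset <= arc_len

  # Two arcs intersect exactly when one contains the other's start point.
  adjacent <- start_on | t(start_on)
  diag(adjacent) <- FALSE
  rowSums(adjacent)
}

set.seed(3)
theta <- runif(2000, 0, 2 * pi)  # 2000 independent observations -> 1000 arcs
deg   <- deg_sketch(theta)
print(summary(deg))
print(mean(deg) / (length(deg) - 1))  # fraction of intersecting pairs, near 5/6
```

Under this construction and independent observations, two arcs are joined with probability \(1 - \frac{1}{6} = \frac{5}{6}\), so the average degree should be close to \(\frac{5}{6}(n-1)\).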

The rcagdd.test() function takes a vector \(\vec{\theta}\) of angular observations (in radians or degrees) and performs the RCAG-DD randomness test. It computes the Hellinger distance between \(\hat{F_n}\) and \(F^*\) as the test statistic, as described in Gehlot and Laha (2025b). The function returns the value(s) of the test statistic, and the null hypothesis of randomness is rejected if any of the test statistics takes a value greater than \(C_\alpha\), where \(C_\alpha\) is the threshold at the level of significance \(\alpha\), obtained using the thrsd.rcagdd() function.

x <- arima.sim(model = list(ar=c(0.6,0.3)), 1000) ## AR(2) model
theta <- ((2*atan(x))%%(2*pi))*(180/pi) ##LAR(2) model
rcagdd.test(theta)

## 
## RCAG-DD Test
## 
## Statistic = 0.4863 
## Reject null hypothesis of randomness if the value(s) of any of the test statistic > C.
##     Calculate C using thrsd.rcagdd() function.
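For readers unfamiliar with the Hellinger distance used as the test statistic, here is a minimal base R sketch for two discrete distributions given as probability vectors on the same support. The helper `hellinger()` is hypothetical, and the \(1/\sqrt{2}\) normalisation (which keeps the distance in \([0, 1]\)) is the common textbook convention, assumed here; the exact form used by rcagdd.test() is given in Gehlot and Laha (2025b).

```r
# Hypothetical helper: Hellinger distance between two probability vectors
# defined on the same support, using the 1/sqrt(2) normalisation (assumed).
hellinger <- function(p, q) {
  stopifnot(length(p) == length(q))
  sqrt(sum((sqrt(p) - sqrt(q))^2) / 2)
}

p <- c(0.2, 0.5, 0.3)
q <- c(0.1, 0.6, 0.3)

h_same     <- hellinger(p, p)              # identical distributions: 0
h_disjoint <- hellinger(c(1, 0), c(0, 1))  # disjoint supports: 1
h_pq       <- hellinger(p, q)              # strictly between 0 and 1

print(c(h_same = h_same, h_disjoint = h_disjoint, h_pq = h_pq))
```

Larger values indicate a bigger gap between the empirical and theoretical degree distributions, which is why the test rejects for large values of the statistic.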

The thrsd.rcagdd() function uses simulations to calculate the threshold \(C_\alpha\) for the RCAG-DD test at the level of significance \(\alpha\). It takes the following parameters:

m - number of observations
n_iter - number of simulation iterations
alpha - level of significance

A table for threshold values (\(C_\alpha\)) of the RCAG-DD test at level of significance \(\alpha\) for various sample sizes \(m\) can be found in Gehlot and Laha (2025b).

thrsd.rcagdd(500,1000,0.05)
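The general idea of a simulation-based threshold can be sketched in a few lines of base R: simulate the statistic many times under the null hypothesis and take its upper \(1-\alpha\) quantile. The code below is an illustration of that idea only. `mc_threshold()` and `toy_stat()` are hypothetical names, the toy statistic (mean resultant length) merely stands in for the RCAG-DD statistic, and null samples are drawn as i.i.d. uniform angles, which is an assumption of this sketch.

```r
set.seed(7)

# Toy statistic standing in for the RCAG-DD statistic (illustration only).
toy_stat <- function(theta) {
  sqrt(mean(cos(theta))^2 + mean(sin(theta))^2)
}

# Hypothetical helper: Monte Carlo threshold(s) for a statistic under the null.
# m: number of observations, n_iter: number of simulated samples,
# alpha: one or more levels of significance.
mc_threshold <- function(stat_fun, m, n_iter, alpha) {
  null_stats <- replicate(n_iter, stat_fun(runif(m, 0, 2 * pi)))
  quantile(null_stats, probs = 1 - alpha)
}

thr <- mc_threshold(toy_stat, m = 100, n_iter = 1000, alpha = c(0.05, 0.01))
print(thr)
```

Because the threshold is itself a simulation estimate, it varies slightly from run to run; increasing n_iter reduces that variability at the cost of computation time.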

Real World Example

We use the wind dataset from the circular package (Agostinelli and Lund, 2022), which contains wind directions recorded at Col de la Roa in the Italian Alps every 15 minutes between 3.00am and 4.00am on each day from January 29, 2001 to March 31, 2001. We apply both the RCAG-EP and RCAG-DD tests to it.

library(circular)

## 
## Attaching package: 'circular'

## The following objects are masked from 'package:stats':
## 
##     sd, var

data(wind)
theta <- wind
rcagep.test(theta,0.05)

## 
## RCAG-EP Test
## 
## data: theta
## prob. of non-intersection: 0.181818181818182, 0.12987012987013, 0.181818181818182 
## Adj p-values: 0.72128, 0.72128, 0.72128 
## Cutoff: 0.08324

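The output of rcagep.test() includes the estimated probability of non-intersection (\(\hat{p}\)), the cutoff value \(C\) for rejecting the null hypothesis of randomness when \(|\hat{p} - \tfrac{1}{6}| > C\), and the corresponding adjusted p-values. Here every reported \(|\hat{p} - \tfrac{1}{6}|\) is at most about 0.037, below the cutoff of 0.08324, and all adjusted p-values exceed 0.05, so the RCAG-EP test does not reject randomness at the 5% level. For a detailed explanation, see Section RCAG-EP Test.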

rcagdd.test(theta)

## 
## RCAG-DD Test
## 
## Statistic = 0.40663 
## Reject null hypothesis of randomness if the value(s) of any of the test statistic > C.
##     Calculate C using thrsd.rcagdd() function.

The output of rcagdd.test() includes the value(s) of the test statistic and the corresponding decision rule for rejecting the null hypothesis of randomness. For a detailed explanation, see Section RCAG-DD Test.

thrsd.rcagdd(length(theta),1000,c(0.05,0.01))

##       95%       99% 
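## 0.4384502 0.4543536

The observed statistic (0.40663) is below both simulated thresholds (0.4385 at the 5% level and 0.4544 at the 1% level), so the RCAG-DD test does not reject the null hypothesis of randomness for the wind data.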

Shriya Gehlot and Arnab Kumar Laha

2025-08-25
