gptzeror provides an R interface to the GPTZero API. GPTZero predicts whether
text was generated by “AI” like ChatGPT. It splits documents by paragraph and
sentence, allowing for detection when text is written partly by “AI” and
partly by humans.
You can install the development version of gptzeror from GitHub with:
# install.packages('remotes')
remotes::install_github('christopherkenny/gptzeror')
Below is an example using the abstract of Kenny, McCartan, Simko, Kuriwaki, and Imai (2023).
abstr <- 'Congressional district lines in many U.S. states are drawn by partisan actors, raising concerns about gerrymandering. To separate the partisan effects of redistricting from the effects of other factors including geography and redistricting rules, we compare possible party compositions of the U.S. House under the enacted plan to those under a set of alternative simulated plans that serve as a non-partisan baseline. We find that partisan gerrymandering is widespread in the 2020 redistricting cycle, but most of the electoral bias it creates cancels at the national level, giving Republicans two additional seats on average. Geography and redistricting rules separately contribute a moderate pro-Republican bias. Finally, we find that partisan gerrymandering reduces electoral competition and makes the partisan composition of the U.S. House less responsive to shifts in the national vote.'
We can pass text directly via gptzero_predict_text().
library(gptzeror)
gptzero_predict_text(abstr)
#> # A tibble: 5 × 10
#>   doc_average_generated_prob doc_completely_generated_p…¹ doc_overall_burstiness
#>                        <dbl>                        <dbl>                  <dbl>
#> 1                        0.2                      0.00228                   101.
#> 2                        0.2                      0.00228                   101.
#> 3                        0.2                      0.00228                   101.
#> 4                        0.2                      0.00228                   101.
#> 5                        0.2                      0.00228                   101.
#> # ℹ abbreviated name: ¹doc_completely_generated_prob
#> # ℹ 7 more variables: par_completely_generated_prob <dbl>,
#> #   par_num_sentences <int>, par_start_sentence_index <int>,
#> #   sentence_index <int>, generated_prob <int>, perplexity <int>,
#> #   sentence <chr>
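Because the result has one row per sentence, it can be filtered like any other data frame. The sketch below is illustrative only: it uses a small hand-made data frame in place of a real API response, borrowing the column names sentence_index, generated_prob, and sentence from the output above. The values, the assumption that generated_prob lies between 0 and 1, and the 0.5 cutoff are all invented for the example and are not GPTZero recommendations.

```r
# Hypothetical stand-in for the result of gptzero_predict_text().
# Column names follow the output shown above; all values are invented.
res <- data.frame(
  sentence_index = 1:5,
  generated_prob = c(0.05, 0.92, 0.10, 0.81, 0.03), # assumed 0-1 scale
  sentence = paste('Sentence', 1:5),
  stringsAsFactors = FALSE
)

# Arbitrary cutoff chosen for illustration only.
threshold <- 0.5

# Keep only the sentences whose estimated probability exceeds the cutoff.
flagged <- res[res$generated_prob > threshold,
               c('sentence_index', 'generated_prob', 'sentence')]

# Share of sentences flagged under this cutoff.
share_flagged <- mean(res$generated_prob > threshold)

print(flagged)
print(share_flagged)
```

With a real response, the same subsetting would apply to the tibble returned by gptzero_predict_text(abstr), provided the columns have the names shown in its printed output.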
The API also accepts common file types as uploads, including .txt, .docx, and
.pdf. To access this endpoint, use gptzero_predict_file().
temp_file <- tempfile(fileext = '.txt')
cat(abstr, file = temp_file)
gptzero_predict_file(temp_file)
#> # A tibble: 5 × 10
#>   doc_average_generated_prob doc_completely_generated_p…¹ doc_overall_burstiness
#>                        <dbl>                        <dbl>                  <dbl>
#> 1                        0.2                      0.00228                   101.
#> 2                        0.2                      0.00228                   101.
#> 3                        0.2                      0.00228                   101.
#> 4                        0.2                      0.00228                   101.
#> 5                        0.2                      0.00228                   101.
#> # ℹ abbreviated name: ¹doc_completely_generated_prob
#> # ℹ 7 more variables: par_completely_generated_prob <dbl>,
#> #   par_num_sentences <int>, par_start_sentence_index <int>,
#> #   sentence_index <int>, generated_prob <int>, perplexity <int>,
#> #   sentence <chr>
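Before uploading, it may be worth checking that a file exists and has one of the extensions mentioned above. The following is a minimal sketch of such a check using base R only. The helper is_uploadable() and the vector known_ext are hypothetical names introduced here, not part of gptzeror, and the extension list covers only the three types named in the text; the API may accept others.

```r
# Extensions named in the text above; the API may accept more than these.
known_ext <- c('txt', 'docx', 'pdf')

# Hypothetical helper (not part of gptzeror): TRUE when the path exists
# and its extension, ignoring case, is one of the listed types.
is_uploadable <- function(path) {
  file.exists(path) && tolower(tools::file_ext(path)) %in% known_ext
}

# A text file that passes the check.
ok_file <- tempfile(fileext = '.txt')
cat('Some text to check.', file = ok_file)

# A file with an extension that is not in the list.
other_file <- tempfile(fileext = '.csv')
cat('a,b\n1,2\n', file = other_file)

print(is_uploadable(ok_file))
print(is_uploadable(other_file))
```

A check like this could run just before gptzero_predict_file(), so that unsupported or missing files are caught locally and no request is sent for them.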
Further details on the endpoints are available in the GPTZero API documentation.