---
title: "SemanticDistance_Word_Pairs"
author: "Jamie Reilly, Hannah Mechtenberg, Emily Myers, Jonathan E. Peelle"
date: "`r Sys.Date()`"
vignette: >
  %\VignetteIndexEntry{SemanticDistance_Word_Pairs}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
vignetteBuilder: knitr
output:
  rmarkdown::html_vignette:
    toc: yes
---
```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```
```{r, message=FALSE, echo=F, warning=F}
# Load SemanticDistance
library(SemanticDistance)
```
# Word Pairs
The package includes a sample dataframe, `Word_Pairs`. Each row holds one word pair, with the two words arrayed in separate columns. The two columns need not be immediately adjacent within your dataframe.
```{r, eval=T, message=F, warning=F, echo=F}
knitr::kable(head(Word_Pairs, 5), format = "simple")
```
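
To use your own data rather than the bundled sample, a dataframe of the same shape can be built by hand. The chunk below is a minimal sketch in base R; the object name `My_Pairs`, the column names, and the words are all hypothetical and chosen only to show that the two word columns can be separated by an unrelated column.

```{r}
# Hypothetical toy data: names and words are illustrative, not from the package
My_Pairs <- data.frame(
  first_word  = c("dog", "banana", "rocket"),
  trial_id    = 1:3,  # unrelated column sitting between the two word columns
  second_word = c("leash", "apple", "moon"),
  stringsAsFactors = FALSE
)
# Inspect the structure: two character columns plus one integer column
str(My_Pairs)
```

Assuming the cleaning function accepts any quoted column names, such a dataframe would be passed along as `clean_paired_cols(dat=My_Pairs, wordcol1='first_word', wordcol2='second_word')`.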
## Clean Word Pairs Arrayed in Columns
Arguments to `clean_paired_cols` are:

- `dat`: your raw dataframe with two columns of paired text
- `wordcol1`: quoted name of the column where your first word lives
- `wordcol2`: quoted name of the column where your second word lives
- `lemmatize`: if TRUE, transforms each raw word to its lemmatized form; T/F, default is TRUE

```{r, message=FALSE}
WordPairs_Clean <- clean_paired_cols(dat=Word_Pairs, wordcol1='word1', wordcol2='word2', lemmatize=TRUE)
knitr::kable(head(WordPairs_Clean, 6), format = "simple", digits=2)
```
## Word Pairs Semantic Distance
`dist_paired_cols` generates semantic distances (GloVe and SD15) between the word pairs held in the two columns of a cleaned dataframe. It takes a single argument:

- `dat`: dataframe with word pairs arrayed in columns, cleaned and prepped using `clean_paired_cols`

```{r, message=FALSE}
Columns_Dists <- dist_paired_cols(dat=WordPairs_Clean)
knitr::kable(head(Columns_Dists, 6), format = "simple", digits=2)
```
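
For intuition about what a distance between two words can mean, the chunk below sketches a cosine distance between two toy vectors in base R. This is an illustration under stated assumptions, not the package's internal code: the vignette does not say which metric `dist_paired_cols` uses, the function `cosine_distance` is hypothetical, and the three-dimensional vectors are invented stand-ins for real embeddings, which have many more dimensions.

```{r}
# Invented 3-dimensional vectors standing in for two word embeddings
vec_dog   <- c(0.8, 0.1, 0.3)
vec_leash <- c(0.7, 0.2, 0.4)

# Hypothetical helper: cosine distance = 1 - cosine similarity
# (assumed metric; not confirmed by this vignette)
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Vectors pointing in similar directions give a distance near 0
toy_dist <- cosine_distance(vec_dog, vec_leash)
toy_dist
```

Under this definition, identical vectors have a distance of 0 and larger values indicate words that are further apart.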