R/join_closest.R
genome_join_closest.Rd
Join intervals on chromosomes in data frames, to the closest partner
    genome_join_closest(x, y, by = NULL, mode = "inner",
      distance_column_name = NULL, max_distance = Inf, select = "all")

    genome_inner_join_closest(x, y, by = NULL, ...)

    genome_left_join_closest(x, y, by = NULL, ...)

    genome_right_join_closest(x, y, by = NULL, ...)

    genome_full_join_closest(x, y, by = NULL, ...)

    genome_semi_join_closest(x, y, by = NULL, ...)

    genome_anti_join_closest(x, y, by = NULL, ...)
Argument | Description |
---|---|
x | A dataframe. |
y | A dataframe. |
by | A character vector with 3 entries which are used to match the chromosome, start and end column. For example: |
mode | One of "inner", "full", "left", "right", "semi" or "anti". |
distance_column_name | A string that is used as the new column name with the distance. If |
max_distance | The maximum distance that is allowed to join two entries. |
select | A string that is passed on to |
... | Additional arguments passed on to genome_join_closest. |
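
To make the roles of "closest partner" and `max_distance` concrete, here is a minimal base-R sketch of the idea. It is not the package's implementation: the helpers `interval_gap()` and `closest_partner()` are hypothetical, and the distance definition (0 for overlapping intervals, otherwise the difference between the nearer end and start) is an assumption made for illustration only.

```r
# Hypothetical helper: gap between intervals on the same chromosome.
# Assumption: overlapping intervals have distance 0.
interval_gap <- function(start1, end1, start2, end2) {
  pmax(0, pmax(start1, start2) - pmin(end1, end2))
}

# Hypothetical helper: for each row of x, the row index of the closest
# interval in y on the same chromosome, or NA if nothing lies within
# max_distance. Ties are resolved by taking the first candidate.
closest_partner <- function(x, y, max_distance = Inf) {
  vapply(seq_len(nrow(x)), function(i) {
    same_chr <- which(y$chromosome == x$chromosome[i])
    if (length(same_chr) == 0) return(NA_integer_)
    d <- interval_gap(x$start[i], x$end[i], y$start[same_chr], y$end[same_chr])
    best <- which.min(d)
    if (d[best] > max_distance) NA_integer_ else same_chr[best]
  }, integer(1))
}

x <- data.frame(chromosome = c("chr1", "chr1", "chr2"),
                start = c(100, 200, 300), end = c(150, 250, 350))
y <- data.frame(chromosome = c("chr1", "chr2"),
                start = c(160, 500), end = c(180, 520))

# Rows 1 and 2 are 10 and 20 positions away from y row 1; row 3 is 150
# positions away from y row 2, which exceeds max_distance.
partner <- closest_partner(x, y, max_distance = 50)
print(partner)
# partner is c(1L, 1L, NA)
```

In this sketch an unmatched row simply gets `NA`; how unmatched rows appear in the real result depends on the chosen `mode`.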
The joined dataframe of x and y.
    library(dplyr)

    x1 <- data.frame(id = 1:4, bla=letters[1:4],
                     chromosome = c("chr1", "chr1", "chr2", "chr2"),
                     start = c(100, 200, 300, 400),
                     end = c(150, 250, 350, 450))

    x2 <- data.frame(id = 1:4, BLA=LETTERS[1:4],
                     chromosome = c("chr1", "chr2", "chr2", "chr1"),
                     start = c(140, 210, 400, 300),
                     end = c(160, 240, 415, 320))

    j <- genome_intersect(x1, x2, by=c("chromosome", "start", "end"), mode="both")
    print(j)
    #>   id.x bla chromosome id.y BLA start end
    #> 1    1   a       chr1    1   A   140 150
    #> 2    4   d       chr2    3   C   400 415
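
The example above calls `genome_intersect`, not the function documented on this page. As a hedged sketch, the same data could be passed to `genome_join_closest` using the signature shown in the usage section. The package name `tidygenomics`, the column name `"distance"`, and the `max_distance` value of 100 are assumptions chosen for illustration; no output is shown because it has not been verified here.

```r
x1 <- data.frame(id = 1:4, bla = letters[1:4],
                 chromosome = c("chr1", "chr1", "chr2", "chr2"),
                 start = c(100, 200, 300, 400),
                 end = c(150, 250, 350, 450))

x2 <- data.frame(id = 1:4, BLA = LETTERS[1:4],
                 chromosome = c("chr1", "chr2", "chr2", "chr1"),
                 start = c(140, 210, 400, 300),
                 end = c(160, 240, 415, 320))

# Assumption: the function is provided by the tidygenomics package.
# The guard skips the call when that package is not installed.
if (requireNamespace("tidygenomics", quietly = TRUE)) {
  j_closest <- tidygenomics::genome_join_closest(
    x1, x2,
    by = c("chromosome", "start", "end"),
    mode = "left",                       # keep every row of x1
    distance_column_name = "distance",   # illustrative column name
    max_distance = 100                   # illustrative cutoff
  )
  print(j_closest)
}
```

The `genome_left_join_closest(x1, x2, by = ...)` wrapper from the usage section should be an equivalent way to request the same `mode`.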