Correlation
correlation_matrix(data, l1=None, l2=None, method='pearson')
Warning
If you know that your data has no nulls, you should use np.corrcoef
instead.
While this function will return the correct result and is reasonably fast,
computing the null-aware correlation matrix will always be slower than assuming
that there are no nulls.
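The trade-off in the warning can be illustrated with NumPy alone. The sketch below is not this library's implementation. It assumes that "null-aware" means pairwise deletion (dropping rows where either value is missing), and it uses NaN as a stand-in for a null because NumPy arrays have no separate null value.

```python
import numpy as np

# Two columns with no missing values: np.corrcoef applies directly.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
r_clean = np.corrcoef(x, y)[0, 1]

# A single missing value (NaN here) propagates into the plain result.
y_missing = y.copy()
y_missing[2] = np.nan
r_nan = np.corrcoef(x, y_missing)[0, 1]

# Assumed null handling: keep only rows where both values are present.
mask = ~np.isnan(x) & ~np.isnan(y_missing)
r_pairwise = np.corrcoef(x[mask], y_missing[mask])[0, 1]

print(r_clean, r_nan, r_pairwise)
```

The extra masking step has to be repeated for every pair of columns, which is one reason a null-aware computation costs more than a single call on complete data.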
Compute the null-aware correlation matrix between two lists of columns. If both
lists are None, then the correlation matrix is over all columns in the input
DataFrame. If l1 is not None, and is a list of 2-tuples, l1 is interpreted
as the combinations of columns to compute the correlation for.
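The rules above for l1 and l2 can be sketched as a small pair-selection function. `resolve_pairs` is a hypothetical helper written for illustration, not part of the library. It assumes that "all columns" means every ordered pair, that l2 is ignored when l1 holds 2-tuples, and it raises for cases this page does not describe (for example, only one of the two lists being given).

```python
from itertools import product


def resolve_pairs(columns, l1=None, l2=None):
    """Hypothetical helper: list the (col_a, col_b) pairs to correlate."""
    # Both lists omitted: correlate every column with every column.
    if l1 is None and l2 is None:
        return list(product(columns, columns))
    # l1 given as 2-tuples: use exactly those combinations (l2 is ignored
    # in this sketch, which is an assumption).
    if l1 is not None and all(
        isinstance(item, tuple) and len(item) == 2 for item in l1
    ):
        return list(l1)
    # Two plain lists: every column in l1 against every column in l2.
    if l1 is not None and l2 is not None:
        return list(product(l1, l2))
    # Anything else is not described on this page, so the sketch refuses it.
    raise ValueError("unsupported combination of l1 and l2 in this sketch")


print(resolve_pairs(["a", "b", "c"], l1=["a"], l2=["b", "c"]))
```

Passing 2-tuples is useful when only a few specific correlations are needed, since it avoids computing the full cross product of columns.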
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`data` | `Union[LazyFrame, DataFrame, ConvertibleToPolars]` | The input DataFrame. It must be either a Polars Frame or something convertible to a Polars Frame. | required |
`l1` | `Union[list[str], list[tuple[str, str]]]` | A list of columns to appear as the columns of the correlation matrix, by default None | `None` |
`l2` | `list[str]` | A list of columns to appear as the rows of the correlation matrix, by default None | `None` |
`method` | `CorrelationMethod` | How to calculate the correlation, by default "pearson" | `'pearson'` |
Returns:
Type | Description |
---|---|
`DataFrame` | A correlation matrix with |
Source code in python/rapidstats/_corr.py