We've added relative deltas and credible intervals to A/B testing in PostHog. Here's what that means...

A relative delta is the percentage change in conversion rate between the control and test variants, so a bigger delta means a bigger impact for an experiment.

The credible interval is more nuanced. An experiment measures a value (like a conversion rate), but the displayed result isn't the *true* value – it's an approximation, because the experiment only samples a small portion of the population. The credible interval gives you a better view of the truth by showing a likely range for the result, along with a probability percentage that reflects how certain we are the true value falls within it.

Relative deltas are useful in most situations where you want to understand the broad improvement. The credible interval is a more advanced metric, useful for digging into the nitty-gritty of statistical significance.

We've also made it easier to ship the winning variant when your experiment reaches a significant result, via a shiny new 'Make decision' modal. Snazzy!
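Both metrics fit in a few lines. The sketch below is a hypothetical illustration (not PostHog's actual implementation): the relative delta as a percentage change, and a Bayesian credible interval for a conversion rate from a Beta posterior with a uniform prior.

```python
# Hypothetical sketch: relative delta and a credible interval for a
# conversion rate, assuming a Beta(1, 1) (uniform) prior.
from scipy.stats import beta

def relative_delta(control_rate, test_rate):
    """Percentage change of the test variant relative to control."""
    return (test_rate - control_rate) / control_rate * 100

def credible_interval(conversions, visitors, mass=0.95):
    """Central 95% credible interval for the true conversion rate,
    from a Beta posterior with a uniform Beta(1, 1) prior."""
    posterior = beta(1 + conversions, 1 + visitors - conversions)
    lo, hi = posterior.ppf([(1 - mass) / 2, (1 + mass) / 2])
    return lo, hi

print(relative_delta(0.10, 0.12))    # ≈ 20 (% lift)
print(credible_interval(120, 1000))  # range likely containing the true rate
```

The interval narrows as the sample grows, which is exactly the "more data, more certainty" intuition the post describes.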
PostHog’s Post
More Relevant Posts
-
Author of "Statistical Methods in Online A/B Testing" | Founder of Analytics-Toolkit.com | Owner @ Web Focus LLC
❔ What if the observed effect in an A/B test is smaller than the minimum detectable effect (MDE) used in planning it? Is it cause for concern? Does it make a test less trustworthy? Should you require both statistical significance and an observed effect higher than the MDE for a valid experiment? These and other questions surrounding the MDE and its relationship with the observed effect from a test are discussed in detail in the second installment of my three-part series on observed power 👉 https://lnkd.in/dy64RA_i Does one of the issues pointed out in the article stand out to you as the biggest problem for practitioners or clients? Do you think the term itself may be to blame for its multiple misinterpretations? #abtesting #statistics #power #pvalue #hypothesistesting
-
This article by John V. Kane on Medium draws an interesting but often overlooked contrast between correlation analysis (specifically, Pearson's pairwise correlation) and regression analysis (specifically, bivariate ordinary least squares (OLS)). The major point I drew from the article: while correlation and regression coefficients are often seen as similar, it is possible for two variables to be strongly correlated while the regression slope (the estimated effect of one on the other) is small. Have you encountered situations where correlation and regression analysis yielded different insights? How did you handle it? If you would like to know more, give the article a read by following the link attached to this post. https://lnkd.in/dRekMTiD
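The contrast is easy to reproduce. Below is a synthetic example (made-up data) where the Pearson correlation is near 1 but the OLS slope is tiny: correlation is scale-free, while the slope carries the units of the effect.

```python
# Strong correlation, weak effect: a tight but nearly flat relationship.
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0, 100, 200)
y = 0.001 * x + rng.normal(0, 0.003, x.size)   # tiny slope, little noise

r = np.corrcoef(x, y)[0, 1]                    # Pearson correlation
slope, intercept = np.polyfit(x, y, 1)         # bivariate OLS fit

print(r)       # close to 1: strong linear association
print(slope)   # close to 0.001: a tiny effect per unit of x
```

Standardizing both variables would make the slope equal the correlation, which is why the two are so often conflated.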
-
Given the significant advantages of the Cox PH model, particularly in handling continuous variables, adjusting for covariates, and providing more accurate P-values, it is difficult to justify the continued reliance on the log-rank test. The medical and epidemiological communities should reconsider the widespread use of the log-rank test and prioritize more robust methods like the Cox PH model, which can offer more reliable insights into survival data.
When is Log rank preferred over Univariable Cox regression? As per Professor Harrell, there is no advantage, and even disadvantages:
- the log-rank test does not work for continuous exposures
- it does not allow for covariate adjustment
- the usual P-value from log-rank may not be as accurate as the likelihood ratio χ2 statistic from Cox PH

"For the life of me I don't know why we still teach log-rank and why it keeps appearing in medical and epidemiologic journals. It has all the assumptions of Cox and more." FH
When is Log rank preferred over Univariable Cox regression?
discourse.datamethods.org
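For context, the two-sample log-rank statistic the post argues against is only a few lines from scratch. This is an illustrative sketch with made-up survival data; real analyses would use a package such as lifelines or R's survival.

```python
# Two-sample log-rank chi-square statistic (1 degree of freedom),
# implemented from scratch on invented survival data.
import numpy as np

def logrank_statistic(time, event, group):
    """Sum of observed-minus-expected events over distinct event times,
    squared and divided by the hypergeometric variance."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group)
    obs_minus_exp = 0.0
    var = 0.0
    for t in np.unique(time[event]):            # each distinct event time
        at_risk = time >= t
        n = at_risk.sum()
        n1 = (at_risk & (group == 1)).sum()     # at risk in group 1
        d = ((time == t) & event).sum()         # events at t (both groups)
        d1 = ((time == t) & event & (group == 1)).sum()
        obs_minus_exp += d1 - d * n1 / n        # observed minus expected
        if n > 1:                               # hypergeometric variance
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var

# toy data: group 1 tends to fail later (0 = censored, 1 = event)
times  = [2, 3, 4, 5, 6, 7, 8, 9]
events = [1, 1, 1, 0, 1, 1, 0, 1]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
print(logrank_statistic(times, events, groups))
```

Note that the statistic handles only a binary grouping and no covariates, which is precisely Harrell's complaint: a univariable Cox model gives the same kind of comparison plus continuous exposures and adjustment.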
-
One-way ANOVA is a statistical test used to determine if there are significant differences between the means of three or more independent groups. It assesses the impact of a single categorical independent variable on a continuous dependent variable. This test is commonly used in research to compare group means and identify any statistically significant variation.
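As a quick illustration, here is a one-way ANOVA on three small groups using scipy (the numbers are made up for the example).

```python
# One-way ANOVA: does at least one group mean differ from the others?
from scipy.stats import f_oneway

group_a = [1, 2, 3]
group_b = [2, 3, 4]
group_c = [5, 6, 7]

f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f_stat)   # 13.0 here: between-group variance is 13x within-group
print(p_value)  # small p => at least one group mean differs
```

A significant F only says *some* means differ; a post-hoc test (e.g. Tukey's HSD) is needed to say which.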
-
We have deployed a test of goodness for probabilistic predictions (https://lnkd.in/e4Cm9H5D). For example, using a 6-month forecast period, in Jan we forecast July, in Feb we forecast Aug, etc. Based on a one-year record, NVDA turns out to be mostly going up, with the realised probability level mostly centring on 0.5-1. (https://lnkd.in/eWmrZxet) MMM turns out to be going down, with the realised probability level mostly centring on 0-0.7. (https://lnkd.in/euhzCH_m) The realised probability level (https://lnkd.in/e9xWfApD) measures how much probability mass in the predicted distribution is below the actual observed value. The prediction was always made 6 months before the actual value is observed, with a 6-month forecast period.
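The realised probability level is also known as the probability integral transform (PIT). A minimal sketch, assuming a normal predictive distribution (the post's actual predictive distribution may differ):

```python
# Realised probability level (PIT): predicted-distribution mass below
# the observed value, here assuming a N(mean, std^2) forecast.
import math

def realised_probability_level(observed, mean, std):
    """CDF of the normal forecast evaluated at the observation."""
    return 0.5 * (1 + math.erf((observed - mean) / (std * math.sqrt(2))))

# hypothetical forecast made 6 months ago: N(100, 10^2); outcome was 112
level = realised_probability_level(112, 100, 10)
print(round(level, 3))   # near 1 => the outcome beat the forecast
```

If the forecasts are well calibrated, these levels should look uniform on [0, 1] over many forecasts; levels piling up near 1 (as reported for NVDA) mean the asset kept outperforming the predicted distribution.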
-
Full Professor of Biostatistics at Isfahan University of Medical Sciences. Senior Researcher at UPC, Barcelona Tech. Top 2% Scientists Worldwide by Stanford University
The Welch test is a statistical test used in the context of Analysis of Variance (ANOVA) when the assumption of equal variances (homogeneity of variances) across groups is violated. Traditional ANOVA assumes that all groups have the same variance, but when this assumption is not met, the Welch test provides a robust alternative.

When to use the Welch test?
- Violation of homogeneity of variances: when a preliminary test (such as Levene's test or Bartlett's test) indicates that the variances across groups are significantly different.
- Unequal sample sizes: when the groups being compared have different sample sizes, which can exacerbate the impact of unequal variances.
- Robustness to assumption violations: when there is concern that the assumptions underlying the traditional ANOVA are not fully met, and a more robust test is desired.
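A minimal from-scratch sketch of Welch's ANOVA statistic F* (the data are invented; for two groups this reduces to Welch's t-test):

```python
# Welch's ANOVA: weights each group by n/s^2, so groups with larger
# variance contribute less, removing the equal-variance assumption.
import numpy as np

def welch_anova(*groups):
    """Welch's F* and its degrees of freedom for k independent groups."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                                   # precision weights
    grand = np.sum(w * m) / np.sum(w)           # weighted grand mean
    a = np.sum(w * (m - grand) ** 2) / (k - 1)  # between-group term
    tail = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    b = 1 + 2 * (k - 2) / (k ** 2 - 1) * tail   # denominator correction
    df2 = (k ** 2 - 1) / (3 * tail)             # approximate error df
    return a / b, k - 1, df2

f_star, df1, df2 = welch_anova([1.1, 2.0, 2.9],
                               [3.8, 5.1, 6.0, 7.2],
                               [9.5, 10.1, 11.4])
print(f_star, df1, df2)
```

The p-value then comes from an F distribution with (df1, df2) degrees of freedom; df2 is fractional because it is an approximation, much like the Welch–Satterthwaite correction in the two-sample t-test.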
-
Which statistical test is used to compare two or more categorical and continuous variables?
-
The chi-square test is a statistical method used to determine if there is a significant association between categorical variables. It compares the observed frequencies in each category to the expected frequencies, which are calculated under the assumption that there is no association. This test is commonly used in hypothesis testing to evaluate the independence of variables in contingency tables and to assess goodness of fit.
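A short illustration of the independence test on a 2x2 contingency table, using scipy (the counts are invented for the example):

```python
# Chi-square test of independence: observed vs expected counts
# in a 2x2 contingency table.
from scipy.stats import chi2_contingency

observed = [[10, 20],   # e.g. clicked / not clicked
            [20, 10]]   # for two user segments
chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(chi2, dof)        # statistic and degrees of freedom
print(expected)         # counts expected under independence (all 15 here)
```

`correction=False` disables Yates' continuity correction so the statistic matches the textbook sum of (observed - expected)^2 / expected; for small 2x2 tables the corrected default is usually preferred.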
-
WordPress Web Dev | Manual Testing | A/B Testing | Node JS | JavaScript | Playwright | Selenium WebdriverIO | CRO | API Testing | UAT | Performance Testing | Security Testing | Product Analyst | Entrepreneur | Innovator
𝗪𝗵𝗮𝘁 𝗜𝘀 ‘𝗖𝗼𝘃𝗮𝗿𝗶𝗮𝘁𝗲 𝗔𝗱𝗷𝘂𝘀𝘁𝗺𝗲𝗻𝘁’ 𝗜𝗻 𝗔/𝗕 𝗧𝗲𝘀𝘁𝗶𝗻𝗴? Covariate adjustment in A/B testing is a statistical technique used to account for pre-existing differences between groups by incorporating additional variables (covariates) into the analysis. This method helps to improve the precision and accuracy of the estimated treatment effects, thereby enhancing the reliability of the test results.
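A hedged sketch of one common form of covariate adjustment: regressing the outcome on the treatment flag plus a pre-experiment covariate, which soaks up covariate-driven noise and tightens the treatment-effect estimate. The data are simulated; the true lift is 0.5.

```python
# Covariate adjustment via OLS: including a pre-experiment covariate
# shrinks residual noise around the treatment-effect estimate.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
treat = rng.integers(0, 2, n)                  # random assignment
pre = rng.normal(0, 1, n)                      # pre-experiment covariate
y = 0.5 * treat + 2.0 * pre + rng.normal(0, 1, n)

# unadjusted estimate: simple difference in means
unadjusted = y[treat == 1].mean() - y[treat == 0].mean()

# adjusted estimate: OLS on [intercept, treatment, covariate]
X = np.column_stack([np.ones(n), treat, pre])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = coef[1]

resid_sd = np.std(y - X @ coef)                # ~1 here, vs ~sqrt(5) raw
print(unadjusted, adjusted, resid_sd)
```

Both estimators are unbiased under randomization; the adjusted one simply has a much smaller standard error, which is the precision gain the post describes (the same idea underlies techniques like CUPED).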
CTO @Qred | AWS Community builder | Sec-t(.org) | AWS Stockholm meetup organizer
Thanks for an awesome product! I just got internal feedback on this when I shared it: "This is great! I was calculating the relative delta manually every time"