Deep in the Woods, the Deer Appears (Part 6): Probability Theory and Mathematical Statistics (5)

Abstract: Today, the editor brings you "Deep in the Woods, the Deer Appears (Part 6): Probability Theory and Mathematical Statistics (5)".

Share interests, spread happiness, broaden horizons, and leave beautiful memories!

Dear reader, this is LearningYard New Academy.

Today, the editor brings you "Deep in the Woods, the Deer Appears (Part 6): Probability Theory and Mathematical Statistics (5)".

Welcome!

Mind Map

In the real world, many random phenomena must be described by several random variables at once. For example, a study of the physical condition of adults in a region has to consider height and weight together, and an analysis of financial markets has to track both stock prices and trading volumes. Phenomena that need several random variables for a joint description are the subject of multidimensional random variables.

The two-dimensional random variable, denoted (X, Y), is the most common multidimensional case. Unlike the one-dimensional case, the study of (X, Y) covers not only the properties of each variable but, more importantly, the relationship between the two. Its probability distribution is described in one of two ways: for the discrete type, we use the joint distribution law (joint probability mass function), which lists the probability of every possible pair of values; for the continuous type, we use the joint density function, from which probabilities are computed by double integration.
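For the discrete case, a joint distribution law can be sketched as a small lookup table. The table below is a made-up example (the values and the helper name `prob` are assumptions for illustration, not taken from the article); it only shows that the probabilities sum to 1 and that the probability of an event is the sum over the pairs belonging to it.

```python
from fractions import Fraction

# Hypothetical joint distribution law of (X, Y):
# keys are (x, y) pairs, values are P(X = x, Y = y).
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 2),
}

def prob(joint, event):
    """P((X, Y) in event): sum the joint probabilities of the pairs where event(x, y) holds."""
    return sum(p for (x, y), p in joint.items() if event(x, y))

total = prob(joint, lambda x, y: True)      # all probabilities together
p_equal = prob(joint, lambda x, y: x == y)  # P(X = Y)

print(total, p_equal)
```

In the continuous case the same sum becomes a double integral of the joint density over the region that defines the event.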

When we care about the distribution of only one of the variables, we need the concept of the marginal distribution. In essence, the marginal distribution "ignores" the other variable in the joint distribution and keeps the distribution of a single variable. In the discrete case it is obtained by summing over all possible values of the other variable; in the continuous case, by integrating over the other variable. The marginal distribution lets us focus on the behavior of one variable within a complex joint relationship.
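A minimal sketch of the discrete case, assuming a made-up joint table (the values and the function name `marginals` are illustrative only): each marginal is obtained by summing the joint probabilities over the other variable.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint distribution law of (X, Y).
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 2),
}

def marginals(joint):
    """Return (p_X, p_Y): sum the joint probabilities over the other variable."""
    px, py = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        px[x] += p  # summing over y gives P(X = x)
        py[y] += p  # summing over x gives P(Y = y)
    return dict(px), dict(py)

px, py = marginals(joint)
print(px)  # distribution of X alone
print(py)  # distribution of Y alone
```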

The conditional distribution reveals another kind of information: how one variable is distributed given that the other takes a specific value, for example the distribution of weight given a known height. This concept greatly enriches our understanding of the relationship between variables and is an important tool for statistical inference and prediction. The multiplication formula ties the pieces together: the joint distribution equals the marginal distribution of one variable multiplied by the conditional distribution of the other given that variable. These relations form a basic network of relationships in probability theory.
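As a rough illustration for the discrete case, the conditional distribution of Y given X = x0 divides each joint probability by the marginal probability of x0. The table and the function name `conditional_y_given_x` are assumptions made for this sketch; the last lines check the multiplication formula on the same table.

```python
from fractions import Fraction

# Hypothetical joint distribution law of (X, Y).
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 2),
}

def conditional_y_given_x(joint, x0):
    """Return {y: P(Y = y | X = x0)}, defined only when P(X = x0) > 0."""
    px0 = sum(p for (x, _), p in joint.items() if x == x0)  # marginal P(X = x0)
    if px0 == 0:
        raise ValueError("conditioning on an event of probability zero")
    return {y: p / px0 for (x, y), p in joint.items() if x == x0}

cond = conditional_y_given_x(joint, 1)
px1 = sum(p for (x, _), p in joint.items() if x == 1)

# Multiplication formula: P(X = 1, Y = y) = P(X = 1) * P(Y = y | X = 1).
rebuilt = {y: px1 * q for y, q in cond.items()}
print(cond, rebuilt)
```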

Independence of random variables is a core concept. If two random variables are independent, the value taken by one does not affect the distribution of the other. In mathematical terms, the joint distribution equals the product of the marginal distributions. Independence greatly simplifies the analysis of multidimensional random variables. In practice, however, strictly independent variables are rare, and we need statistical methods to test whether variables are independent.
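For a fully known discrete joint table, the product criterion can be checked directly. The sketch below uses two made-up tables and a hypothetical helper `is_independent`; it verifies the definition on exact probabilities and is not a statistical test on sample data.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """True if P(X=x, Y=y) == P(X=x) * P(Y=y) for every pair of support values."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    # Pairs missing from the table are treated as having probability 0.
    px = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == px[x] * py[y] for x, y in product(xs, ys))

# Hypothetical dependent table: P(0, 0) = 1/8 but P(X=0) * P(Y=0) = 3/32.
dependent = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 2),
}

# Hypothetical independent table, built as a product of two marginals.
mx = {0: Fraction(1, 4), 1: Fraction(3, 4)}
my = {0: Fraction(1, 3), 1: Fraction(2, 3)}
independent = {(x, y): mx[x] * my[y] for x in mx for y in my}

print(is_independent(dependent), is_independent(independent))
```

With real data the joint table is unknown and only estimated, which is why a statistical test is needed instead of an exact comparison.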

Finding the distribution of a function of two random variables is a challenge that comes up often in practical problems. Common functions include the sum, difference, product, and quotient, as well as the maximum and minimum. The two main methods are the distribution function method and the convolution formula. The distribution function method is general but computationally heavy; the convolution formula applies only to specific forms, such as the sum of independent variables, but is concise. Mastering these methods takes a lot of practice, especially in determining the region of integration.
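Both methods have simple discrete analogues. The sketch below assumes two independent fair dice (an example chosen here, not taken from the article): `convolve` applies the discrete convolution formula to the sum, and `max_cdf` applies the distribution function method to the maximum, using P(max ≤ t) = P(X ≤ t) · P(Y ≤ t). All function names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

def convolve(px, py):
    """pmf of Z = X + Y for independent X, Y: P(Z=z) = sum over x of P(X=x) * P(Y=z-x)."""
    pz = defaultdict(Fraction)
    for x, p in px.items():
        for y, q in py.items():
            pz[x + y] += p * q
    return dict(pz)

def cdf(pmf, t):
    """Distribution function P(V <= t) of a discrete variable with the given pmf."""
    return sum(p for v, p in pmf.items() if v <= t)

def max_cdf(px, py, t):
    """P(max(X, Y) <= t) for independent X, Y: both variables must be <= t."""
    return cdf(px, t) * cdf(py, t)

die = {k: Fraction(1, 6) for k in range(1, 7)}  # one fair die

sum_pmf = convolve(die, die)
p_max_is_6 = max_cdf(die, die, 6) - max_cdf(die, die, 5)

print(sum_pmf[7], max_cdf(die, die, 3), p_max_is_6)
```

For continuous variables the sums become integrals, and the main difficulty is the one the text points out: working out the integration region for each value of z.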

The theory of multidimensional random variables lays a solid foundation for modern statistics and machine learning. In regression analysis, we study the joint distribution of the dependent and independent variables; in principal component analysis, we look for new dimensions that describe the joint variation of several variables; in Bayesian networks, we model complex dependencies among many random variables. All of these applications are built on the theoretical framework of multidimensional random variables.

That's all for today's sharing.

If you have your own thoughts on the article,

please leave us a message.

Let's meet again tomorrow.

We wish you a happy day!

Translation: ERNIE Bot (文心一言)

Reference: Baidu Baike (百度百科)