Chapter 5 Exercises, Data Analysis (Mei Changlin)
Editor: 蓝色心情  ID: 14-389129  5号文库  Published: 2023-04-12 03:43:36  Source: Internet

Chapter 5 Exercises

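1. Exercise 5.1

Solution: Assume that both populations are normally distributed with equal covariance matrices ($\Sigma_1 = \Sigma_2$), that the misclassification costs are equal, and that the prior probabilities are proportional to the sample sizes. SAS gives the prior probabilities shown in the table: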

Class Level Information

| group | Variable Name | Frequency | Weight | Proportion | Prior Probability |
|---|---|---|---|---|---|
| G1 | G1 | 6 | 6.0000 | 0.428571 | 0.428571 |
| G2 | G2 | 8 | 8.0000 | 0.571429 | 0.571429 |
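
As a quick check on the table, priors allocated in proportion to the group sizes are simply the group counts divided by the total sample size. A minimal Python sketch, assuming the group sizes 6 and 8 implied by the Weight column (the function name `proportional_priors` is illustrative and not part of SAS):

```python
def proportional_priors(counts):
    """Return prior probabilities proportional to the group sample sizes."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

# Group sizes taken from the Weight column of the table above.
priors = proportional_priors({"G1": 6, "G2": 8})
for group, q in priors.items():
    print(f"{group}: {q:.6f}")
```

The printed values can be compared directly with the Prior Probability column.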

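That is, the prior probabilities are 0.428571 for G1 and 0.571429 for G2.

Further computation gives:

The pooled covariance matrix S is computed as: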

Pooled Within-Class Covariance Matrix, DF =

| Variable | x1 | x2 |
|---|---|---|
| x1 | 1.081944444 | -0.310902778 |
| x2 | -0.310902778 | 0.174756944 |
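
For readers who want to reproduce a matrix of this kind outside SAS, the pooled within-class covariance is the weighted average of the group covariance matrices, with weights $n_j - 1$ and divisor $n - k$. The sketch below assumes this standard definition; the two small samples are hypothetical and are not the exercise data.

```python
import numpy as np

def pooled_covariance(groups):
    """Pooled within-class covariance: sum of (n_j - 1) * S_j divided by (n - k)."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    scatter = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups)
    return scatter / (n_total - k)

# Hypothetical two-variable samples; NOT the exercise data.
g1 = np.array([[1.0, 2.0], [2.0, 1.5], [3.0, 1.0], [4.0, 0.5]])
g2 = np.array([[6.0, 1.0], [7.0, 0.0], [8.0, 0.5]])
S = pooled_covariance([g1, g2])
print(S)
```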

and:

The generalized squared distance function is computed as:

and the posterior probability as:
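
For reference, the standard forms of these two quantities under a common covariance matrix and unequal priors are sketched below. The notation is an assumption and may differ from the textbook's: $\bar{x}_j$ is the sample mean of group $j$, $S$ the pooled covariance matrix, and $q_j$ the prior probability of group $j$.

```latex
D_j^2(x) = (x - \bar{x}_j)^{\top} S^{-1} (x - \bar{x}_j) - 2 \ln q_j, \qquad j = 1, 2,

P(G_j \mid x) = \frac{\exp\{-\tfrac{1}{2} D_j^2(x)\}}{\sum_{k=1}^{2} \exp\{-\tfrac{1}{2} D_k^2(x)\}} .
```

An observation is assigned to the group with the smaller $D_j^2(x)$, which is the same as the group with the larger posterior probability.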

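The resubstitution results are as follows:

Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| 1 | G1 | G1 | 0.9387 | 0.0613 |
| 2 | G1 | G1 | 0.9303 | 0.0697 |
| 3 | G1 | G1 | 0.9999 | 0.0001 |
| 4 | G1 | G2 * | 0.4207 | 0.5793 |
| 5 | G1 | G1 | 0.9893 | 0.0107 |
| 6 | G1 | G1 | 1.0000 | 0.0000 |
| 7 | G2 | G2 | 0.0007 | 0.9993 |
| 8 | G2 | G2 | 0.0026 | 0.9974 |
| 9 | G2 | G2 | 0.0008 | 0.9992 |
| 10 | G2 | G2 | 0.0586 | 0.9414 |
| 11 | G2 | G2 | 0.0350 | 0.9650 |
| 12 | G2 | G2 | 0.0006 | 0.9994 |
| 13 | G2 | G2 | 0.0038 | 0.9962 |
| 14 | G2 | G2 | 0.0012 | 0.9988 |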
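
The posterior probabilities in a table like this can be computed from the generalized squared distances. The following is a minimal sketch under the equal-covariance assumption: the pooled covariance matrix and priors are the values reported above, while the group means and the point `x` are hypothetical placeholders, since the exercise data are not reproduced here.

```python
import numpy as np

def posterior_probs(x, means, S, priors):
    """Posteriors from generalized squared distances under a pooled covariance."""
    S_inv = np.linalg.inv(S)
    d2 = np.array([
        (x - m) @ S_inv @ (x - m) - 2.0 * np.log(q)
        for m, q in zip(means, priors)
    ])
    w = np.exp(-0.5 * (d2 - d2.min()))  # shift by the minimum for numerical stability
    return w / w.sum()

# Pooled covariance matrix and priors as reported in the SAS output.
S = np.array([[1.081944444, -0.310902778],
              [-0.310902778, 0.174756944]])
priors = [0.428571, 0.571429]
# Hypothetical group means and test point (assumptions, not the exercise data).
means = [np.array([2.0, 1.0]), np.array([4.0, 0.5])]
x = np.array([2.5, 0.9])
post = posterior_probs(x, means, S, priors)
print(post, "-> G%d" % (post.argmax() + 1))
```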

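Hence the resubstitution estimate of the misclassification rate is 1/14 ≈ 0.0714, since one of the 14 observations is misclassified.

For the cross-validation method, the generalized squared distance is defined as follows:

Each observation is removed in turn and classified by the rule built from the remaining ones; the posterior probability is computed by the following formula:

SAS gives the results shown in the table. Again observation 4, which belongs to G1, is misclassified into G2, so the cross-validation estimate of the misclassification rate is also 1/14 ≈ 0.0714.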

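Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| 1 | G1 | G1 | 0.9060 | 0.0940 |
| 2 | G1 | G1 | 0.7641 | 0.2359 |
| 3 | G1 | G1 | 1.0000 | 0.0000 |
| 4 | G1 | G2 * | 0.1950 | 0.8050 |
| 5 | G1 | G1 | 0.9743 | 0.0257 |
| 6 | G1 | G1 | 1.0000 | 0.0000 |
| 7 | G2 | G2 | 0.0012 | 0.9988 |
| 8 | G2 | G2 | 0.0051 | 0.9949 |
| 9 | G2 | G2 | 0.0014 | 0.9986 |
| 10 | G2 | G2 | 0.0713 | 0.9287 |
| 11 | G2 | G2 | 0.0422 | 0.9578 |
| 12 | G2 | G2 | 0.0009 | 0.9991 |
| 13 | G2 | G2 | 0.0059 | 0.9941 |
| 14 | G2 | G2 | 0.0022 | 0.9978 |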
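
The leave-one-out procedure described above can be sketched in a few lines. This is an illustration under stated assumptions rather than a reimplementation of PROC DISCRIM: the priors are held fixed while each observation is left out, the covariance is pooled, and the data are synthetic with the same group sizes (6 and 8) as the exercise.

```python
import numpy as np

def lda_predict(x, X, y, priors):
    """Assign x to the class with the smallest generalized squared distance."""
    classes = np.unique(y)
    groups = [X[y == c] for c in classes]
    scatter = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups)
    S_inv = np.linalg.inv(scatter / (len(X) - len(classes)))
    d2 = [(x - g.mean(axis=0)) @ S_inv @ (x - g.mean(axis=0)) - 2 * np.log(q)
          for g, q in zip(groups, priors)]
    return classes[int(np.argmin(d2))]

def loo_error(X, y, priors):
    """Leave-one-out estimate: refit without observation i, then classify it."""
    wrong = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        wrong += lda_predict(X[i], X[keep], y[keep], priors) != y[i]
    return wrong / len(X)

# Synthetic data with group sizes 6 and 8; NOT the exercise data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 1.0, (6, 2)), rng.normal([3, 3], 1.0, (8, 2))])
y = np.array([1] * 6 + [2] * 8)
print(loo_error(X, y, priors=[6 / 14, 8 / 14]))
```

With priors proportional to the group sizes, the fraction of misclassified observations returned here coincides with the prior-weighted error rate that SAS reports.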

where the computed value is 12.1138; from the accompanying conditions, the posterior probability finally comes out as p = 0.048709.

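2. Exercise 5.3

Solution: (1) Assuming equal covariance matrices ($\Sigma_1 = \Sigma_2$) and equal prior probabilities, we construct the linear discriminant function for distance discrimination. Using the SAS PROC DISCRIM procedure, we first obtain the pooled covariance matrix shown in the table: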

Pooled Within-Class Covariance Matrix, DF =

| Variable | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 |
|---|---|---|---|---|---|---|---|---|
| x1 | 2.25705591 | -0.91513311 | 0.34259974 | -0.6084399 | -0.9576508 | -0.8929719 | -0.0539445 | -0.2192724 |
| x2 | -0.9151331 | 25.2318255 | -0.3390873 | -2.5515272 | -5.0966371 | 0.78571637 | -0.0835586 | 4.37529806 |
| x3 | 0.34259974 | -0.33908734 | 3.30063123 | 1.42276017 | 1.78692343 | 0.40208409 | -0.0676655 | -0.0732213 |
| x4 | -0.6084399 | -2.55152726 | 1.42276017 | 6.07845863 | 5.78100857 | 2.32039331 | -0.3205116 | 0.48605897 |
| x5 | -0.9576508 | -5.09663714 | 1.78692343 | 5.78100857 | 8.15854743 | 3.44983429 | -0.1096651 | 0.08904743 |
| x6 | -0.8929719 | 0.78571637 | 0.40208409 | 2.32039331 | 3.44983429 | 4.16657066 | -0.2236278 | 0.87862549 |
| x7 | -0.0539445 | -0.08355869 | -0.0676655 | -0.3205116 | -0.1096651 | -0.2236278 | 0.26009291 | -0.0767347 |
| x8 | -0.2192724 | 4.37529806 | -0.0732213 | 0.48605897 | 0.08904743 | 0.87862549 | -0.0767347 | 2.51054423 |

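The squared Mahalanobis distances between the populations are shown in the table:

Generalized Squared Distance to group

| From group | G1 | G2 |
|---|---|---|
| G1 | 0 | 24.61468 |
| G2 | 24.61468 | 0 |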

The linear discriminant function is:
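
In its usual two-group form, the linear discriminant function for distance discrimination is $W(x) = \left(x - \tfrac{1}{2}(\bar{x}_1 + \bar{x}_2)\right)^{\top} S^{-1} (\bar{x}_1 - \bar{x}_2)$, with $x$ assigned to G1 when $W(x) \ge 0$; it equals half the difference of the squared Mahalanobis distances to the two group means. The sketch below illustrates this form with hypothetical means and covariance, because the group means of the exercise are not given here; the function names are illustrative.

```python
import numpy as np

def linear_discriminant(x, mean1, mean2, S):
    """W(x) = (x - (mean1 + mean2)/2)' S^{-1} (mean1 - mean2); W >= 0 -> G1."""
    a = np.linalg.solve(S, mean1 - mean2)
    return (x - 0.5 * (mean1 + mean2)) @ a

def mahalanobis_sq(u, v, S):
    """Squared Mahalanobis distance between u and v with respect to S."""
    d = u - v
    return d @ np.linalg.solve(S, d)

# Hypothetical means and covariance for illustration; NOT the exercise data.
mean1, mean2 = np.array([1.0, 2.0]), np.array([3.0, 1.0])
S = np.array([[2.0, 0.5], [0.5, 1.0]])
x = np.array([1.5, 1.8])
w = linear_discriminant(x, mean1, mean2, S)
print("W(x) =", w, "->", "G1" if w >= 0 else "G2")
```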

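Resubstitution of the training sample gives the classification results shown in the table:

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.0000 | 0.0000 | 0.0000 |
| Priors | 0.5000 | 0.5000 | |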

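Cross-validation results for the training sample:

Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| | G1 | G2 * | 0.4501 | 0.5499 |
| | G1 | G2 * | 0.0920 | 0.9080 |

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.1000 | 0.0000 | 0.0500 |
| Priors | 0.5000 | 0.5000 | |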
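
The Total entry in an error-count table is the prior-weighted sum of the group error rates. A short sketch that reproduces the total from the cross-validation rates and priors above (the helper name `total_error_rate` is illustrative):

```python
def total_error_rate(rates, priors):
    """Overall error-count estimate: sum over groups of prior * group rate."""
    return sum(q * r for q, r in zip(priors, rates))

# Group rates and priors from the cross-validation table above.
total = total_error_rate(rates=[0.1000, 0.0000], priors=[0.5000, 0.5000])
print(f"{total:.4f}")
```

The same weighting applies when the priors are proportional to the group sizes; only the weights change.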

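(2) Assume that both populations are normally distributed, that the priors are proportional to the sample sizes, and that the misclassification costs are equal. Under equal covariance matrices, i.e. $\Sigma_1 = \Sigma_2$, Bayes discriminant analysis with the SAS DISCRIM procedure gives the following results:

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.0000 | 0.0000 | 0.0000 |
| Priors | 0.7407 | 0.2593 | |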

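Cross-validation results:

Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| | G1 | G2 * | 0.2246 | 0.7754 |
| | G2 | G1 * | 0.5282 | 0.4718 |

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.0500 | 0.1429 | 0.0741 |
| Priors | 0.7407 | 0.2593 | |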

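Assuming unequal covariance matrices ($\Sigma_1 \neq \Sigma_2$) and priors proportional to the sample sizes, we run Bayes discriminant analysis with the SAS PROC DISCRIM procedure. Each population's covariance matrix is now estimated separately from its own training sample, which gives the following resubstitution and cross-validation results for the training sample:

Resubstitution results:

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.0000 | 0.0000 | 0.0000 |
| Priors | 0.7407 | 0.2593 | |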

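Cross-validation results:

Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| | G2 | G1 * | 1.0000 | 0.0000 |
| | G2 | G1 * | 1.0000 | 0.0000 |
| | G2 | G1 * | 1.0000 | 0.0000 |
| | G2 | G1 * | 1.0000 | 0.0000 |
| | G2 | G1 * | 1.0000 | 0.0000 |
| | G2 | G1 * | 1.0000 | 0.0000 |
| | G2 | G1 * | 1.0000 | 0.0000 |

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.0000 | 1.0000 | 0.2593 |
| Priors | 0.7407 | 0.2593 | |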
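
One possible reading of this result, offered as a conjecture rather than as part of the original solution: the table lists seven misclassified G2 observations with a G2 error rate of 1.0000, so G2 appears to contain only seven training observations, fewer than the eight variables. A covariance matrix estimated from $n$ observations has rank at most $n - 1$, so the separate G2 estimate cannot be of full rank, and a quadratic rule built on it may behave erratically. The sketch below illustrates the rank deficiency on synthetic data of the same shape.

```python
import numpy as np

# Synthetic sample: 7 observations on 8 variables; NOT the exercise data.
rng = np.random.default_rng(1)
sample = rng.normal(size=(7, 8))

S2 = np.cov(sample, rowvar=False)      # 8 x 8 within-group covariance estimate
rank = np.linalg.matrix_rank(S2)
print("shape:", S2.shape, "rank:", rank)  # rank is at most n - 1 = 6
```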

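(3) Classification of the unlabeled samples under the different assumptions and discriminant methods:

1. Distance discrimination gives the following results for Tibet, Shanghai and Guangdong:

Posterior Probability of Membership in group

| Obs | Classified into group | G1 | G2 |
|---|---|---|---|
| | G2 | 0.0000 | 1.0000 |
| | G2 | 0.0000 | 1.0000 |
| | G2 | 0.0000 | 1.0000 |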

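2. Under equal covariance matrices, Bayes discrimination gives the following results for Tibet, Shanghai and Guangdong:

Posterior Probability of Membership in group

| Obs | Classified into group | G1 | G2 |
|---|---|---|---|
| | G2 | 0.0000 | 1.0000 |
| | G2 | 0.0000 | 1.0000 |
| | G2 | 0.0000 | 1.0000 |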

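3. Under unequal covariance matrices, Bayes discrimination gives the following results for Tibet, Shanghai and Guangdong:

Posterior Probability of Membership in group

| Obs | Classified into group | G1 | G2 |
|---|---|---|---|
| | G1 | 1.0000 | 0.0000 |
| | G1 | 1.0000 | 0.0000 |
| | G1 | 1.0000 | 0.0000 |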

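3. Exercise 5.4

Solution: (1) Assume that both populations are normally distributed with equal covariance matrices, i.e. $\Sigma_1 = \Sigma_2$, and that the prior probabilities are equal. Bayes discriminant analysis with the SAS DISCRIM procedure gives the following results:

First, the linear discriminant function is obtained: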

Resubstitution misclassifications:

Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| 9 | G1 | G2 * | 0.3401 | 0.6599 |
| 29 | G2 | G1 * | 0.8571 | 0.1429 |

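The results show that observation 9 is misclassified into G2 and observation 29 into G1. The misclassification rate is 6.34%.

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.0833 | 0.0435 | 0.0634 |
| Priors | 0.5000 | 0.5000 | |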

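Cross-validation results: four observations in total are misclassified, namely observations 9, 28, 29 and 35. The overall misclassification rate is 10.69%.

Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| 9 | G1 | G2 * | 0.0973 | 0.9027 |
| 28 | G2 | G1 * | 0.6130 | 0.3870 |
| 29 | G2 | G1 * | 0.9643 | 0.0357 |
| 35 | G2 | G1 * | 0.8470 | 0.1530 |

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.0833 | 0.1304 | 0.1069 |
| Priors | 0.5000 | 0.5000 | |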

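(2) Assume that both populations are normally distributed with equal covariance matrices, i.e. $\Sigma_1 = \Sigma_2$, that the priors are proportional to the sample sizes, and that the misclassification costs are equal. Bayes discriminant analysis with the SAS DISCRIM procedure gives the following results:

First, the linear discriminant function is obtained: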

Linear Discriminant Function for group

| Variable | G1 | G2 |
|---|---|---|
| Constant | -99.91796 | -95.41991 |
| x1 | 30.35060 | 29.87680 |
| x2 | -0.15214 | -0.15210 |
| x3 | -0.78868 | -0.22662 |
| x4 | 1.95176 | 1.39528 |
| x5 | 0.58964 | 0.06490 |
| x6 | -108.10195 | -85.33735 |
| x7 | -0.31156 | -0.25957 |
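
To show how such a table is used, the sketch below evaluates both linear scores for one observation and assigns it to the group with the larger score. It assumes, as in the PROC DISCRIM output convention, that each constant already includes the log prior, so the posteriors follow from exponentiating and normalizing the scores. The coefficients are copied from the table above; the observation `x` is hypothetical.

```python
import numpy as np

# Constants and coefficients (x1..x7) copied from the table above.
const = {"G1": -99.91796, "G2": -95.41991}
coef = {
    "G1": np.array([30.35060, -0.15214, -0.78868, 1.95176, 0.58964, -108.10195, -0.31156]),
    "G2": np.array([29.87680, -0.15210, -0.22662, 1.39528, 0.06490, -85.33735, -0.25957]),
}

def classify(x):
    """Return the group with the larger linear score and both posteriors."""
    scores = np.array([const[g] + coef[g] @ x for g in ("G1", "G2")])
    post = np.exp(scores - scores.max())  # subtract the maximum for stability
    post /= post.sum()
    return ("G1", "G2")[int(scores.argmax())], post

# Hypothetical observation, for illustration only (not from the exercise).
x = np.array([5.0, 100.0, 10.0, 20.0, 15.0, 0.5, 50.0])
group, post = classify(x)
print(group, post.round(4))
```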

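Resubstitution misclassifications:

Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| | G1 | G2 * | 0.2119 | 0.7881 |
| | G2 | G1 * | 0.7579 | 0.2421 |

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.0833 | 0.0435 | 0.0571 |
| Priors | 0.3429 | 0.6571 | |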

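Cross-validation misclassifications:

Posterior Probability of Membership in group

| Obs | From group | Classified into group | G1 | G2 |
|---|---|---|---|---|
| | G1 | G2 * | 0.3436 | 0.6564 |
| | G1 | G2 * | 0.0532 | 0.9468 |
| | G1 | G2 * | 0.4052 | 0.5948 |
| | G1 | G2 * | 0.3519 | 0.6481 |
| | G2 | G1 * | 0.9338 | 0.0662 |
| | G2 | G1 * | 0.7428 | 0.2572 |

Error Count Estimates for group

| | G1 | G2 | Total |
|---|---|---|---|
| Rate | 0.3333 | 0.0870 | 0.1714 |
| Priors | 0.3429 | 0.6571 | |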
