Research
Journal of Military Science and Technology, Special Issue, No.57A, 11  2018 25
WLAN FINGERPRINTING BASED INDOOR POSITIONING IN
THE PRESENCE OF DROPPED MIXTURE DATA
Vu Trung Kien1,*, Hoang Manh Kha1, Le Hung Lan2
Abstract: In Wireless Local Area Networks (WLAN), unexpected equipment
behavior and changes in the surrounding environment can introduce dropping and
multi-component problems into the observed data. Dropping refers to the fact that
Received Signal Strength Indication (RSSI) measurements of WiFi access points
(AP) are occasionally unavailable, although their values are clearly above the
sensitivity limit of the WiFi sensors on portable devices. The multi-component
problem occurs when the measured data vary because of obstacles, user direction,
doors being open or closed, etc. Taking these problems into consideration, this
paper proposes to model the RSSI distribution by the dropping Gaussian Mixture
Model (GMM) and develops an extended version of the Expectation-Maximization
(EM) algorithm to estimate the parameters of such a model in the training phase of
WLAN fingerprinting based indoor positioning systems (IPS). Simulation results
demonstrate the effectiveness of the proposed method.
Keywords: Indoor positioning, Fingerprinting, EM algorithm, Dropping, Gaussian mixture model.
1. INTRODUCTION
WLAN fingerprinting based IPSs: With the popularity of Wireless Local
Area Networks (WLAN), WiFi based indoor positioning techniques are widely
used for indoor user localization. The most popular WiFi positioning methods
make use of the Received Signal Strength Indication (RSSI). Among those, the
fingerprinting based method is the most suitable for complex indoor environments
because a line of sight between transmitter and receiver is not required [1]. This
method estimates the position of an object from training data collected at a set
of reference points (RP) with known locations, and consists of two phases: the
training phase and the classification (online positioning) phase. In the training
phase, training data (RSSI values) are collected at the RPs from WiFi access points
(AP) and used to build the training database, which is often called the radio map.
In the online positioning phase, the target’s position is estimated by computing the
similarity between online observations and the radio map. Within RSSI
fingerprinting based positioning methods, there are two common approaches to
estimating the user location: deterministic approaches [2, 7], and stochastic
approaches [9–11], which use a probabilistic model of the training data and compute
either the likelihood of observing the online measurements given the
position-dependent probability density function (PDF), or the posterior of being at
a position given the online observations, to arrive at the position estimate. The
stochastic approaches appear to cope efficiently with the variations in observed
data in both the training and the classification procedure. In these approaches, the
radio map stores statistical parameters of the RSSI distributions of all APs at all
training positions instead of raw RSSI measurements [10, 11]. Therefore, the
accuracy of the positioning results depends heavily on the accuracy of the
estimated parameters.
Modeling RSSI distribution: There are two common categories of models for
the distribution of RSSI values: parametric and non-parametric models. As
reported in [2–4], systems that employ a parametric model outperform those that
employ a non-parametric model. Most studies showed that the majority of RSSI
histograms fit the Gaussian distribution very well if sufficient samples have been
collected [2], [12–16]. Therefore, the Gaussian model is the most feasible
parametric model for modeling WiFi RSSI data.
This work examines phenomena observed in measured WiFi RSSI data. In
[3, 4, 17], the authors recognized the dropping problem in the observed data. The
dropping problem refers to the fact that RSSI measurements occasionally fall back
to the sensitivity limit of the WiFi sensor, although the portable device (e.g. a
smartphone) used to measure RSSI is close enough to the WiFi AP. Dropping
might be due to reasons such as limitations of the WiFi chipset driver, that is,
limited buffer sizes or timeouts, or APs being temporarily switched off for
energy-saving purposes. A single Gaussian distribution was chosen as the model
for WiFi RSSI data throughout [3, 4].
In [5, 6, 12], the multi-component problem was noticed. In [12], the authors
showed that human behavior in the measurement environment (absence,
sitting/standing still, moving randomly and moving specifically) led to bimodal
phenomena in experimental data. In this case, using a single Gaussian distribution
to model the distribution of RSSI is not appropriate. In [5, 6], the GMM was used to
model the observed RSSI data, because changes in the surrounding environment,
for example a door being closed or open and the direction of the user, clearly
change the measured signal strength. However, the authors of [5, 6] did not
consider the dropping problem in their work.
Parameter estimation: In order to estimate the parameters of a probabilistic
model in the presence of missing data, the EM algorithm [8] seems to be the most
feasible estimator among the available approaches. This algorithm is an iterative
method for finding maximum likelihood estimates of parameters in statistical
models. Each iteration consists of two steps: the E-step and the M-step. In [3, 4], an
EM algorithm was proposed to estimate the parameters of censored and dropped
data, but the multi-component problem was not addressed. In [5, 6, 8], the EM
algorithm for the GMM can deal with the multi-component problem, but the
dropping problem was not solved.
Considering the multi-component and dropping problems present in collected
WiFi RSSI data, this paper proposes to model the WiFi RSSI distribution by the
dropping GMM and develops an extended version of the EM algorithm to estimate
the parameters of this model. Moreover, the Maximum a Posteriori (MAP)
method is extended to estimate the target’s position in the online positioning
phase when the online measurements suffer from both problems, namely the
multi-component and dropping problems.
This paper consists of four sections. Section 1 is the introduction. In section 2,
our proposal is presented. In section 3, the effectiveness of the proposed approach
in the WLAN fingerprinting based IPS (WFIPS) is evaluated and compared to
others. The paper is concluded in section 4.
2. PROPOSED METHODS
2.1. Modeling RSSI distribution by the dropping GMM
Let $\vec{y} = [y_1, y_2, \cdots, y_N]$, $y_n \in \mathbb{R}$, $n = 1, \ldots, N$, be the set of
unobservable, non-dropped data (complete data), where $N$ is the number of
measurements and the $y_n$ are independent and identically distributed random
variables.
Let $\vec{d} = [d_1, \cdots, d_N]$ be the set of hidden binary variables indicating whether an
observation is dropped ($d_n = 1$) or not ($d_n = 0$); $c$ is the limited sensitivity of the
WiFi sensor on the mobile target. Observable data are possibly dropped data:
$\vec{x} = [x_1, \cdots, x_N]$, where
$$x_n = \begin{cases} y_n, & \text{if } d_n = 0 \\ c, & \text{if } d_n = 1. \end{cases}$$
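The measurement model is depicted in figure 1.
Figure 1. The measurement model in case of the presence of dropped data.
[Diagram labels: complete data follow the single Gaussian distribution; change of
environment; complete data follow the mixture Gaussian distribution; dropping;
presence of dropped data; a switch outputs $x_n = y_n$ when $d_n = 0$ and $x_n = c$
when $d_n = 1$.]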
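The measurement model above can be illustrated with a short simulation. The following is a minimal sketch, not the authors' code: the function name `simulate_dropped_gmm`, the sensitivity limit `c = -100` dBm and the 20% dropped rate are assumptions chosen for illustration only.

```python
import numpy as np

def simulate_dropped_gmm(n, weights, means, stds, drop_rate, c, rng):
    """Draw n complete samples y from a 1-D GMM, then replace each sample
    by the sensitivity limit c with probability drop_rate (d_n = 1)."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    comp = rng.choice(len(weights), size=n, p=weights)  # component of each sample
    y = rng.normal(means[comp], stds[comp])             # complete data y_n
    d = (rng.random(n) < drop_rate).astype(int)         # hidden drop indicators d_n
    x = np.where(d == 1, c, y)                          # observable data x_n
    return x, y, d

rng = np.random.default_rng(0)
# Illustrative values; c = -100 dBm is a hypothetical sensitivity limit.
x, y, d = simulate_dropped_gmm(1000, [0.5, 0.5], [-80.0, -90.0], [3.0, 4.0],
                               drop_rate=0.2, c=-100.0, rng=rng)
print("observed dropped fraction:", d.mean())
```

Every dropped sample carries no information about which component generated it, which is why the estimator has to treat the two cases $d_n = 0$ and $d_n = 1$ separately.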
2.2. An extended EM algorithm for parameter estimation in the presence of
dropped data
In a GMM, the likelihood function of $\vec{y}$ given $\vec{\Theta}$ is:
$$p(\vec{y} \mid \vec{\Theta}) = \prod_{n=1}^{N} \sum_{j=1}^{J} w_j \, p(y_n \mid \theta_j). \tag{1}$$
In equation (1), $\vec{\Theta} = \{w_1, \cdots, w_J;\ \mu_1, \cdots, \mu_J;\ \sigma_1, \cdots, \sigma_J\}$ is the set of
parameters, $\theta_j = \{\mu_j, \sigma_j\}$ is the set of parameters of the $j$-th Gaussian component
($j = 1, \ldots, J$), $J$ is the number of components, and the $w_j$ are positive mixing
weights which sum up to 1.
Let
$$\vec{\Delta} = \begin{bmatrix} \Delta_{1,1} & \cdots & \Delta_{1,J} \\ \vdots & \ddots & \vdots \\ \Delta_{N,1} & \cdots & \Delta_{N,J} \end{bmatrix}$$
be the set of latent variables, where
$$\Delta_{n,j} = \begin{cases} 1, & \text{when } y_n \text{ belongs to the } j\text{-th component} \\ 0, & \text{otherwise.} \end{cases}$$
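Due to the dropping problem, $\vec{d}$ is introduced into the likelihood computation
and the complete data are now $(\vec{y}, \vec{\Delta}, \vec{d})$. The GMM is therefore modified to the
dropping GMM, and equation (1) becomes:
$$p(\vec{y}, \vec{\Delta}, \vec{d} \mid \vec{\Theta}) = \prod_{n=1}^{N} \prod_{j=1}^{J} \left[ w_j \, p(y_n, d_n \mid \theta_j) \right]^{\Delta_{n,j}}. \tag{2}$$
Hence, the log-likelihood is
$$\ln p(\vec{y}, \vec{\Delta}, \vec{d} \mid \vec{\Theta}) = \sum_{n=1}^{N} \sum_{j=1}^{J} \Delta_{n,j} \ln \left[ w_j \, p(y_n, d_n \mid \theta_j) \right]. \tag{3}$$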
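E-step:
Since the hidden variables are not observable, instead of computing the
log-likelihood directly, the expected value of the log-likelihood of the complete
data $(\vec{y}, \vec{\Delta}, \vec{d})$ given the observations $\vec{x}$ and the previously estimated
parameters is calculated:
$$Q(\vec{\Theta}, \vec{\Theta}^{(k)}) = E\left[ \ln p(\vec{y}, \vec{\Delta}, \vec{d} \mid \vec{\Theta}) \,\middle|\, \vec{x}; \vec{\Theta}^{(k)} \right]$$
$$= \sum_{n=1}^{N} \sum_{j=1}^{J} \sum_{d_n \in \{0,1\}} \int \left[ \ln w_j + \ln p(y_n, d_n; \theta_j) \right] p(\Delta_{n,j} = 1, y_n, d_n \mid x_n; \vec{\Theta}^{(k)}) \, dy_n. \tag{4}$$
In equation (4), $\vec{\Theta}^{(k)}$ denotes the current estimated parameters and $k$ is the
iteration index.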
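In the case of $d_n = 0$, equation (4) can be calculated as follows:
$$Q(\vec{\Theta}, \vec{\Theta}^{(k)}) = \sum_{n=1}^{N} \sum_{j=1}^{J} (1 - d_n) \, \gamma_j(x_n; \vec{\Theta}^{(k)}) \left[ \ln w_j + \ln(1 - \psi) + \ln \mathcal{N}(x_n; \theta_j) \right]. \tag{5}$$
Throughout this paper, we use the notation $\psi = P(d_n = 1)$ for the dropped
rate; $\mathcal{N}(\cdots; \theta_j)$ is the Gaussian distribution parameterized by $\theta_j$, and
$$\gamma_j(x_n; \vec{\Theta}^{(k)}) = \frac{w_j^{(k)} \, \mathcal{N}(x_n; \theta_j^{(k)})}{\sum_{j'=1}^{J} w_{j'}^{(k)} \, \mathcal{N}(x_n; \theta_{j'}^{(k)})}.$$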
In the case of $d_n = 1$, equation (4) can be calculated as follows:
$$Q(\vec{\Theta}, \vec{\Theta}^{(k)}) = \sum_{n=1}^{N} \sum_{j=1}^{J} d_n \, w_j^{(k)} \left[ \ln w_j + \ln \psi \right]. \tag{6}$$
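Combining equations (5) and (6), equation (4) ends up as:
$$Q(\vec{\Theta}, \vec{\Theta}^{(k)}) = \sum_{n=1}^{N} \sum_{j=1}^{J} (1 - d_n) \, \gamma_j(x_n; \vec{\Theta}^{(k)}) \left[ \ln w_j + \ln(1 - \psi) + \ln \mathcal{N}(x_n; \theta_j) \right] + \sum_{n=1}^{N} \sum_{j=1}^{J} d_n \, w_j^{(k)} \left[ \ln w_j + \ln \psi \right]. \tag{7}$$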
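M-step:
The parameter re-estimation formulae are obtained by computing the partial
derivatives of equation (7) w.r.t. the elements of $\mu_j, \sigma_j, w_j, \psi$ and setting them to
zero:
$$\mu_j^{(k+1)} = \frac{\sum_{n=1}^{N} (1 - d_n) \, \gamma_j(x_n; \vec{\Theta}^{(k)}) \, x_n}{\sum_{n=1}^{N} (1 - d_n) \, \gamma_j(x_n; \vec{\Theta}^{(k)})}. \tag{8}$$
$$\left(\sigma_j^{2}\right)^{(k+1)} = \frac{\sum_{n=1}^{N} (1 - d_n) \, \gamma_j(x_n; \vec{\Theta}^{(k)}) \left(x_n - \mu_j^{(k+1)}\right)^{2}}{\sum_{n=1}^{N} (1 - d_n) \, \gamma_j(x_n; \vec{\Theta}^{(k)})}. \tag{9}$$
$$w_j^{(k+1)} = \frac{\sum_{n=1}^{N} (1 - d_n) \, \gamma_j(x_n; \vec{\Theta}^{(k)}) + \sum_{n=1}^{N} w_j^{(k)} d_n}{N}. \tag{10}$$
$$\psi^{(k+1)} = \frac{\sum_{n=1}^{N} d_n}{N}. \tag{11}$$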
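As a worked instance of this step, the update for the dropped rate can be derived from equation (7) by keeping only the terms that depend on $\psi$. The sketch below writes $\gamma_j(x_n; \vec{\Theta}^{(k)})$ for the responsibility of component $j$ (a notational assumption) and uses the facts that the responsibilities and the weights $w_j^{(k)}$ each sum to one over $j$:

```latex
\begin{aligned}
\frac{\partial Q}{\partial \psi}
  &= -\frac{1}{1-\psi} \sum_{n=1}^{N} (1-d_n) \sum_{j=1}^{J} \gamma_j(x_n; \vec{\Theta}^{(k)})
     + \frac{1}{\psi} \sum_{n=1}^{N} d_n \sum_{j=1}^{J} w_j^{(k)} \\
  &= -\frac{N - \sum_{n=1}^{N} d_n}{1-\psi} + \frac{\sum_{n=1}^{N} d_n}{\psi} = 0
  \quad\Longrightarrow\quad
  \psi = \frac{1}{N} \sum_{n=1}^{N} d_n .
\end{aligned}
```

The result is simply the fraction of dropped samples, so it does not change from one iteration to the next.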
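The E-step and M-step are executed alternately until the improvement of the
log-likelihood is smaller than a threshold. After convergence we have the
estimated parameters:
$$\mu_j^{(k+1)} \approx \mu_j^{(k)} := \hat{\mu}_j; \quad \sigma_j^{(k+1)} \approx \sigma_j^{(k)} := \hat{\sigma}_j; \quad w_j^{(k+1)} \approx w_j^{(k)} := \hat{w}_j; \quad \psi^{(k+1)} \approx \psi^{(k)} := \hat{\psi}. \tag{12}$$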
Equations (8)–(11) show that both observable data and dropped data contribute
to the estimates. When dropping does not occur ($d_n = 0$ for all $n$), these formulae
reduce to the traditional EM algorithm for the GMM [5]; when $w_1 = 1$ and
$[w_2, \cdots, w_J] = [0, \cdots, 0]$, they reduce to the EM algorithm for a single
Gaussian distribution in the presence of dropped data [4]. This means that our
proposal can not only deal with the dropping and multi-component problems but
also works well when the collected RSSI data are complete or when the RSSI
distribution follows a single Gaussian distribution.
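The alternation of E-step and M-step described in this subsection can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the quantile-based initialisation, the variance floor, the convergence test on the log-likelihood of the non-dropped samples, and the sensitivity limit `c = -100` dBm are all assumptions; `sigma` is treated as a standard deviation, with equation (9) read as a variance update.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x (broadcasts over arrays)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def em_dropping_gmm(x, c, n_components, n_iter=500, tol=1e-8):
    """Fit a 1-D dropping GMM; samples equal to c are treated as dropped."""
    x = np.asarray(x, dtype=float)
    dropped = x == c                        # d_n = 1
    obs = x[~dropped]                       # samples with d_n = 0
    n, n_drop = x.size, int(dropped.sum())
    if obs.size < n_components:
        raise ValueError("not enough non-dropped samples")
    # Initialisation is an assumption: means at evenly spaced quantiles.
    mu = np.quantile(obs, (np.arange(n_components) + 0.5) / n_components)
    sigma = np.full(n_components, obs.std() + 1e-3)
    w = np.full(n_components, 1.0 / n_components)
    psi = n_drop / n                        # dropped-rate update, constant over iterations
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities of each component for the non-dropped samples.
        dens = w * gaussian_pdf(obs[:, None], mu, sigma)    # shape (n_obs, J)
        mix = dens.sum(axis=1) + 1e-300
        gamma = dens / mix[:, None]
        # Log-likelihood of the non-dropped part under the current parameters
        # (the psi terms are constant across iterations and are omitted).
        ll = np.log(mix).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
        # M-step: weighted mean, weighted variance and mixing-weight updates.
        nk = gamma.sum(axis=0)
        mu = (gamma * obs[:, None]).sum(axis=0) / (nk + 1e-12)
        var = (gamma * (obs[:, None] - mu) ** 2).sum(axis=0) / (nk + 1e-12)
        sigma = np.sqrt(np.maximum(var, 1e-6))
        w = (nk + w * n_drop) / n           # dropped samples contribute the old weights
    return {"w": w, "mu": mu, "sigma": sigma, "psi": psi}

# Usage on synthetic data (all numeric values are illustrative assumptions).
rng = np.random.default_rng(1)
n = 5000
comp = rng.choice(2, size=n, p=[0.5, 0.5])
y = rng.normal(np.array([-80.0, -90.0])[comp], np.array([3.0, 4.0])[comp])
c = -100.0                                  # hypothetical sensitivity limit in dBm
x = np.where(rng.random(n) < 0.2, c, y)
fit = em_dropping_gmm(x, c, n_components=2)
print(fit)
```

Because a dropped sample says nothing about its component, the sketch lets it contribute the previous weights to the weight update, which keeps the weights summing to one over all $N$ samples.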
2.3. The online positioning/classification procedure
In this subsection, the Maximum a Posteriori (MAP) method is utilized to
perform the classification. First, the posterior is calculated as follows:
$$P(\ell_k \mid \vec{x}) = \frac{\prod_{i=1}^{N} p(x_i \mid \ell_k) \, P(\ell_k)}{\sum_{k'=1}^{K} \prod_{i=1}^{N} p(x_i \mid \ell_{k'}) \, P(\ell_{k'})}. \tag{13}$$
In equation (13), $K$ and $N$ are the total numbers of RPs and APs, respectively;
$x_i$ is the online measurement from the $i$-th AP, and $\vec{x}$ is the set of $x_i$ ($i = 1, \ldots, N$).
We assume that the RSSI measurements of different APs are independent and that
the prior $P(\ell_k)$ is equal for all locations.
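The posterior computation with equal priors can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes that the per-AP likelihood takes the dropping-GMM form (the dropped rate when the reading equals the sensitivity limit `c`, otherwise one minus the dropped rate times the mixture density), and that the final position is a posterior-weighted average of the `k` RPs with the largest posteriors. The function names, the radio-map layout and all numeric values are hypothetical.

```python
import numpy as np

def ap_likelihood(x, c, psi, w, mu, sigma):
    """Likelihood of one AP reading x under a dropping GMM (assumed form)."""
    if x == c:
        return psi
    dens = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return (1.0 - psi) * float(np.dot(w, dens))

def posteriors(x_vec, radio_map, c):
    """Posterior over RPs with equal priors, computed in the log domain."""
    log_post = np.array([
        sum(np.log(ap_likelihood(x, c, **ap) + 1e-300) for x, ap in zip(x_vec, rp_params))
        for rp_params in radio_map
    ])
    post = np.exp(log_post - log_post.max())   # subtract the max for numerical stability
    return post / post.sum()

def estimate_position(x_vec, radio_map, rp_xy, c, k=3):
    """Posterior-weighted average of the k RPs with the largest posteriors."""
    post = posteriors(x_vec, radio_map, c)
    top = np.argsort(post)[-k:]
    return (post[top, None] * rp_xy[top]).sum(axis=0) / post[top].sum()

c = -100.0  # hypothetical sensitivity limit in dBm

def rp(m0, m1):
    """Hypothetical radio-map entry: one single-component model per AP."""
    return [dict(psi=0.1, w=np.array([1.0]), mu=np.array([m]), sigma=np.array([3.0]))
            for m in (m0, m1)]

radio_map = [rp(-50.0, -70.0), rp(-60.0, -60.0), rp(-70.0, -50.0)]
rp_xy = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
post = posteriors([-51.0, -69.0], radio_map, c)
pos = estimate_position([-51.0, -69.0], radio_map, rp_xy, c, k=2)
print(post, pos)
```

Working in the log domain avoids underflow when the product runs over many APs; a dropped reading contributes the same factor shape at every RP and only discriminates between RPs whose dropped rates differ.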
The likelihood $p(x_i \mid \ell_k)$ can be calculated as follows:
$$p(x_i \mid \ell_k) = \begin{cases} (1 - \hat{\psi}_{k,i}) \sum_{j=1}^{J} \hat{w}_{k,i,j} \, \mathcal{N}(x_i; \hat{\theta}_{k,i,j}), & \text{if } x_i \neq c \\ \hat{\psi}_{k,i}, & \text{if } x_i = c. \end{cases} \tag{14}$$
In equation (14), $\hat{\psi}_{k,i}$, $\hat{\theta}_{k,i,j}$, $\hat{w}_{k,i,j}$ are the estimated parameters at the $k$-th RP
of the $i$-th AP in the training phase.
The estimated position of the mobile object is obtained by:
$$\hat{\ell}(\vec{x}) = \frac{\sum_{\ell \in Knn} \ell \, P(\ell \mid \vec{x})}{\sum_{\ell \in Knn} P(\ell \mid \vec{x})}. \tag{15}$$
Here, $Knn$ are the nearest neighbors chosen among the reference positions by taking
those with the largest posteriors.
3. SIMULATION RESULTS AND DISCUSSION
3.1. Parameter estimation
In this simulation, complete data $\vec{y}$ following the GMM were generated with a
set of parameters (true parameters): $J = 2$; $[w_1, w_2] = [0.5, 0.5]$;
$[\mu_1, \mu_2] = [-80, -90]$; $[\sigma_1, \sigma_2] = [3, 4]$; $N = 1000$. Observable data $\vec{x}$ were
obtained by dropping randomly with dropped rate $\psi$. Figure 2 illustrates the mean
square error (MSE) of the estimated parameters of the proposed EM algorithm for
the dropping GMM (solid line) and the traditional EM algorithm for GMM [5]
(dashed line with “*” marker) when the dropped rate changed from 0% to 30%.
Figure 2. MSE after 1000 simulations of estimated parameters.
3.2. Positioning accuracy
In order to evaluate the effectiveness of the proposed approach in the WFIPS,
we generated a floor plan with 100 RPs (small red circles) and 10 APs (blue
circles) as illustrated in figure 3. The experiment was set up as follows. In the
training phase, 1000 measurements were collected for each RP from all APs.
Measured data (RSSI values) were computed by the log-distance path loss model,
adding a Gaussian term reflecting the fluctuation of the signal [5]. The data at 50%
of the training positions (RPs), chosen randomly, follow single Gaussians; the rest
follow GMMs with J = 2, 3, 4, 5, 6 components, respectively (10% for each
model). Dropping was also applied to the collected data with rates of 10%, 20%
and 30%. The radio map was built by employing the equations developed in
subsection 2.2 with J set to the value which produced the best results, the
stochastic approaches proposed in [4, 5] and the deterministic approaches
introduced in [7]. For the online localization phase, 100 simulations were
performed. In each simulation, one online measurement per position from all APs
was generated in the same scenario as the training data. The MAP method
proposed in subsection 2.3 and the classification rules introduced in [4, 5, 7] were
used for estimating the target’s position.
Figure 3. The computer generated floor plan.
Figures 4, 5, 6 show the probability that the positioning error is lower than a
certain distance. Each plot is computed by averaging the positioning results of 100
simulations. It can be seen that the proposed method outperforms the others.
When dropping occurs with ratios of 10%, 20% and 30%, the mean distance error
of our proposal is reduced by 9.18%, 6.71% and 18.46% compared to [...].
In terms of WiFi fingerprinting based indoor positioning in the presence of
dropped mixture data, the authors of [4] are not able to solve the multi-component
problem, the proposal in [5] has not considered the dropping problem, and the
work in [7] cannot deal with either problem, whereas our simulations with
artificial data show that our proposal is able to cope with the phenomena present
in the measured WiFi RSSI data.
Figure 4. Comparison of positioning results when the observable training and
online data ratio was 90%.
Figure 5. Comparison of positioning
results when the observable training and
online data ratio was 80%.
Figure 6. Comparison of positioning
results when the observable training and
online data ratio was 70%.
Further, figure 7 validates that the dropping GMM is still appropriate for
modeling data with a single Gaussian histogram. In this simulation, the measured
data at 100% of the positions follow single Gaussians and the dropped rate is 20%.
Figure 8 shows that our proposal produces the same results as the standard GMM
when the data are complete.
Figure 7. Comparison of positioning
results when the observable training and
online data ratio was 80%; data at
100% of the positions follow single
Gaussians.
Figure 8. Comparison of positioning
results when the observable training and
online data ratio was 100%; data at
50% of the positions follow single
Gaussians.
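The way the simulated measurements in section 3.2 are produced (a log-distance path loss model plus a Gaussian fluctuation, followed by random dropping) can be sketched as below. This is only an assumption-laden illustration: the function name `simulate_rssi`, the reference power `p0`, the path loss exponent, the noise standard deviation, the sensitivity limit `c` and the floor plan dimensions are all hypothetical, and the sketch covers the single Gaussian case only.

```python
import numpy as np

def simulate_rssi(rp_xy, ap_xy, n_samples, p0=-40.0, path_loss_exp=3.0, d0=1.0,
                  noise_std=3.0, drop_rate=0.1, c=-100.0, rng=None):
    """RSSI samples for every (RP, AP) pair from a log-distance path loss model."""
    rng = np.random.default_rng() if rng is None else rng
    # Pairwise RP-to-AP distances, floored at d0 to keep the logarithm finite.
    dist = np.linalg.norm(rp_xy[:, None, :] - ap_xy[None, :, :], axis=2)
    dist = np.maximum(dist, d0)
    mean_rssi = p0 - 10.0 * path_loss_exp * np.log10(dist / d0)   # shape (n_rp, n_ap)
    # Gaussian term reflecting the fluctuation of the signal.
    noise = rng.normal(0.0, noise_std, size=(n_samples,) + mean_rssi.shape)
    rssi = mean_rssi[None, :, :] + noise
    # Random dropping: a dropped reading is reported as the sensitivity limit c.
    dropped = rng.random(rssi.shape) < drop_rate
    return np.where(dropped, c, rssi), dropped

rng = np.random.default_rng(2)
# Hypothetical floor plan: 100 RPs on a 10 x 10 grid, 10 APs placed at random.
gx, gy = np.meshgrid(np.linspace(0.0, 45.0, 10), np.linspace(0.0, 45.0, 10))
rp_xy = np.column_stack([gx.ravel(), gy.ravel()])
ap_xy = rng.uniform(0.0, 45.0, size=(10, 2))
x, dropped = simulate_rssi(rp_xy, ap_xy, n_samples=1000, rng=rng)
print(x.shape, dropped.mean())
```

A mixture at a given RP could be obtained by drawing the mean offset from several components before adding the noise; that extension is left out to keep the sketch short.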
4. CONCLUSION
The operating states of the WLAN and the variations of received signal strength
in real indoor environments are responsible for the dropping and multi-component
problems, which in turn strongly affect the accuracy of the WFIPS. In this paper,
novel approaches have been introduced to take into account the phenomena
present in collected WiFi RSSI data due to those problems. When part of the data
follows the dropping GMM, the proposed EM algorithm reduces the error of the
estimated parameters and, consequently, improves the positioning results
considerably. It should be noted that the proposed approach still works well when
the measured data are complete (dropped rate of 0%) or when the measured data
follow single Gaussians. In future work, we plan to collect real data and evaluate
the proposed method on it.
REFERENCES
[1]. L. Mainetti, L. Patrono and I. Sergi, “A survey on indoor positioning
systems,” in Proc. 22nd Int. Conf. on Software, Telecommunications and
Computer Networks (SoftCOM), 2014.
[2]. K. Kaemarungsi and P. Krishnamurthy, “Modeling of indoor positioning
systems based on location fingerprinting,” Proceedings of the INFOCOM,
Hong Kong, March 2004.
[3]. K. Hoang and R. Haeb-Umbach, “Parameter Estimation and Classification of
Censored Gaussian Data with Application to WiFi Indoor Positioning,”
Proceedings of the IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP), Vancouver, May 2013.
[4]. K. Hoang, J. Schmalenstroeer, and R. Haeb-Umbach, “Aligning Training
Models with Smartphone Properties in WiFi Fingerprinting based Indoor
Localization,” Proceedings of the IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), Brisbane, April 2015.
[5]. M. Alfakih, M. Keche and H. Benoudnine, “Gaussian Mixture Modeling for
Indoor Positioning WIFI Systems,” 3rd Int. Conf. on Control, Engineering
and Information Technology (CEIT), Tlemcen, Algeria, 2015.
[6]. A. Goswami, L. E. Ortiz, and S. R. Das, “WiGEM: A Learning-Based
Approach for Indoor Localization,” ACM CoNEXT, Tokyo, Japan, 2011.
[7]. Xuxing Ding, Bingbing Wang and Zaijian Wang, “Dynamic threshold
location algorithm based on fingerprinting method,” Wiley ETRI Journal,
DOI: 10.4218/etrij.20170155, 2018.
[8]. G. Lee and C. Scott. “EM algorithms for multivariate Gaussian mixture
models with truncated and censored data,” Computational Statistics & Data
Analysis, Vol. 56, no. 9, pp. 2816–2829, September 2012.
[9]. T. Roos, P. Myllymaki, H. Tirri, P. Misikangas, and J. Sievanen, “A
probabilistic approach to wlan user location estimation,” International
Journal of Wireless Information Networks, Vol. 9, no. 3, pp. 155–164, 2002.
[10]. M. Youssef and A. Agrawala, “The Horus WLAN location determination
system,” in Proc. ACM MobiSys, 2005, pp. 205–218.
[11]. P. Mirowski, D. Milioris, P. Whiting and T. Kam Ho, “Probabilistic
radio-frequency fingerprinting and localization on the run,” Bell Labs
Technical Journal, Vol. 18, no. 4, pp. 111–133, 2014.
[12]. Jiayou Luo and Xingqun Zhan, “Characterization of Smart Phone
Received Signal Strength Indication for WLAN Indoor Positioning Accuracy
Improvement,” Journal of Networks, Vol. 9, No. 3, March 2014.
[13]. K. Kaemarungsi and P. Krishnamurthy, “Properties of Indoor Received
Signal Strength for WLAN Location Fingerprinting,” in Proceedings of the 1st
Annual International Conference on Mobile and Ubiquitous Systems:
Networking and Services (MOBIQUITOUS 2004), Boston, MA, USA, 22–26
August 2004.
[14]. M. Youssef and A. Agrawala, “The Horus WLAN location determination
system,” in Proceedings of the 3rd International Conference on Mobile
Systems, Applications, and Services, Seattle, WA, USA, 6–8 June 2005, pp.
205–218.
[15]. A. Haeberlen, E. Flannery, A. M. Ladd, A. Rudys, D. S. Wallach and L. E.
Kavraki, “Practical Robust Localization over Large-Scale 802.11 Wireless
Networks,” in Proceedings of the 10th Annual International Conference on
Mobile Computing and Networking (MobiCom), Philadelphia, PA, USA,
September 2004.
[16]. Chinyang Henry Tseng and Jing-Shyang Yen, “Enhanced Gaussian
Mixture Model for Indoor Positioning Accuracy,” 2016 International
Computer Symposium (ICS), pp. 462–466.
[17]. S. Beller, Modelladaption zur Verbesserung von Fingerprinting basierter
Indoor navigation. Master Thesis approved by the University of Paderborn,
Paderborn, July 2014.
SUMMARY
INDOOR POSITIONING USING THE “FINGERPRINTING” METHOD BASED
ON A WIRELESS LOCAL AREA NETWORK WHEN THE WiFi SIGNAL IS
OCCASIONALLY DROPPED AND VARIES IN STRENGTH
This paper addresses the dropping of WiFi signals, caused by some devices in
the wireless local area network (WLAN) not operating or malfunctioning, and the
variation of the received signal strength indication (RSSI) caused by changes in the
propagation environment. These phenomena change the distribution of the data
(signal strength values) collected from WiFi access points. In other words,
Gaussian distributions cannot accurately describe the distribution of the data in
these cases. For this reason, the authors propose using a Gaussian mixture model
(GMM) extended to the case of dropped data (dropping GMM) to describe the
data distribution. In addition, an expectation-maximization (EM) algorithm is
proposed to estimate the parameters of this model. Experimental results on
simulated data show that, when the above phenomena occur, an indoor positioning
system using the results of this paper achieves higher accuracy than other
positioning systems.
Keywords: Indoor positioning, “Fingerprinting” method, Expectation-maximization algorithm, Signal dropping,
Gaussian mixture model.
Received 2nd September 2018
Revised 20th October 2018
Accepted 1st November 2018
Author affiliations:
1 Hanoi University of Industry;
2 National Center for Technological Progress, Ministry of Science and Technology.
*Corresponding author: kien.vu@haui.edu.vn; vutrungkienfee@gmail.com