Parametric probability distributions are central tools for probabilistic modeling in data mining, but they are lacking in functional data analysis (FDA). In this paper we propose to build such distributions by jointly using quasi-arithmetic means and generators of Archimedean copulas. We also define a density adapted to the infinite dimension of the space of functional data. We use these concepts in supervised classification.

1. QAMML distributions

Let $(\Omega, \mathcal{A}, P)$ be a probability space and $D$ a closed real interval. A functional random variable (frv) is any function $X : D \times \Omega \to \mathbb{R}$ such that, for any $t \in D$, $X(t, \cdot)$ is a real random variable on $(\Omega, \mathcal{A}, P)$. Let $L(D)$ be the space of functions $u(t)$ defined on $D$ that are square integrable with respect to the Lebesgue measure. If $f, g \in L(D)$, then the pointwise order between $f$ and $g$ on $D$ is defined as follows:

$$\forall t \in D,\ f(t) \le g(t) \iff f \le_D g. \tag{1}$$

It is easy to see that the pointwise order is a partial order on $L(D)$ but not a total order: two functions whose graphs cross on $D$ are not comparable. We define the functional cumulative distribution function (fcdf) of a frv $X$ on $L(D)$, computed at $u \in L(D)$, by:

$$F_{X,D}(u) = P[X \le_D u]. \tag{2}$$

To compute this probability, note that the probability distribution of $X(t)$ is easy to compute for any fixed $t \in D$. We therefore define the surface of distributions $G$ and the surface of densities $g$ as follows:

$$G : D \times \mathbb{R} \to [0, 1] : (t, y) \mapsto P[X(t) \le y] \tag{3}$$

$$g : D \times \mathbb{R} \to [0, \infty[ \, : (t, y) \mapsto \frac{\partial}{\partial y} G(t, y) \tag{4}$$

Various methods can be used to determine suitable $g$ and $G$ for a given frv $X$. For example, if $X$ is a Gaussian process with mean $\mu(t)$ and standard deviation $\sigma(t)$, then, for any $(t, y) \in D \times \mathbb{R}$, we have $G(t, y) = F_{N(\mu(t), \sigma(t))}(y)$ and $g(t, y) = f_{N(\mu(t), \sigma(t))}(y)$. In the following we always evaluate $G$ at a function $u$ of $L(D)$, so, to ease the notation, we write $G[t; u] = G[t, u(t)]$. We use the same notation for $g$.
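The Gaussian-process example and the pointwise order above can be illustrated with a minimal Python sketch. The mean function `mu`, the standard deviation function `sigma`, the interval D = [0, 1], the evaluation grid, and all helper names are assumptions chosen for illustration only; they do not come from the paper. Checking the order on a finite grid is only a discretized stand-in for the order on all of D.

```python
import math

def normal_cdf(y, mean, sd):
    """CDF of N(mean, sd) at y, via the error function."""
    return 0.5 * (1.0 + math.erf((y - mean) / (sd * math.sqrt(2.0))))

def normal_pdf(y, mean, sd):
    """Density of N(mean, sd) at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Hypothetical mean and standard deviation functions on D = [0, 1].
def mu(t):
    return math.sin(2.0 * math.pi * t)

def sigma(t):
    return 0.5 + 0.25 * t

def G(t, y):
    """Surface of distributions: P[X(t) <= y] for the Gaussian process."""
    return normal_cdf(y, mu(t), sigma(t))

def g(t, y):
    """Surface of densities: derivative of G(t, y) with respect to y."""
    return normal_pdf(y, mu(t), sigma(t))

def G_at(t, u):
    """Shorthand G[t; u] = G(t, u(t)) for a function u."""
    return G(t, u(t))

def g_at(t, u):
    """Shorthand g[t; u] = g(t, u(t)) for a function u."""
    return g(t, u(t))

def pointwise_leq(f, h, grid):
    """Discretized check of f <=_D h: f(t) <= h(t) at every grid point."""
    return all(f(t) <= h(t) for t in grid)

grid = [i / 10 for i in range(11)]

def u(t):
    # A curve lying one standard deviation above the mean.
    return mu(t) + sigma(t)

if __name__ == "__main__":
    for t in grid:
        print(f"t={t:.1f}  G[t;u]={G_at(t, u):.4f}  g[t;u]={g_at(t, u):.4f}")
    # The identity and its reflection cross on [0, 1], so neither dominates.
    print(pointwise_leq(lambda t: t, lambda t: 1.0 - t, grid))
    print(pointwise_leq(lambda t: 1.0 - t, lambda t: t, grid))
```

In this sketch the mean curve satisfies `G_at(t, mu) == 0.5` at every `t`, and the two crossing functions at the end are incomparable, which is the reason the pointwise order is only partial.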
In what follows we define our parametric families of probability distributions. Let $X$ be a frv, $u \in L(D)$ and $G$ the surface of distributions of $X$. Let also $\varphi$ be a continuous strictly decreasing function from $[0, 1]$ to $[0, \infty]$ such that $\varphi(0) = \infty$ and $\varphi(1) = 0$, and whose inverse $\psi = \varphi^{-1}$ is completely monotonic on $[0, \infty[$, i.e. $(-1)^k \frac{d^k}{dt^k} \psi(t) \ge 0$ for all $t$ in $[0, \infty[$ and for all $k$. We define the Quasi-Arithmetic Mean of Margins Limit (QAMML) distribution of $X$ by: $F_{X,D}(u) = \psi [$