Automated Mobile Content Compliance Verification Using Multimodal Learning

Denipitiyage, Dishanika Dewani

Access status:

Open Access

Type

Thesis

Thesis type

Doctor of Philosophy

Author/s

Denipitiyage, Dishanika Dewani

Abstract

The rapid expansion of the mobile app ecosystem has intensified concerns about exposure to inappropriate or misleading content, particularly for children. Although regulatory frameworks such as the GDPR and app store policies aim to standardise age-appropriate content, mobile marketplaces still rely heavily on developer-declared ratings. Consequently, content rating compliance remains largely underexplored compared to privacy, security, and malware detection. This thesis investigates the detection of content rating non-compliance in mobile apps. First, it introduces a multimodal similarity search pipeline to identify app metamorphosis, capturing substantial app evolution over five years. By combining text and visual embeddings with a majority-voting correspondence strategy, the study quantifies app progression and reveals the prevalence of rating inconsistencies on Google Play. Second, the thesis proposes a vision–language representation learning framework that jointly analyses app descriptions and visual creatives to detect rating violations, using a cross-attention module to align textual and visual semantics and a ListMLE loss to model the ordinal structure of content ratings. Third, the thesis addresses cross-platform rating inconsistencies by using the Apple App Store as a reference. A content-descriptor-driven data generation pipeline converts app creatives and descriptions into structured question–answer pairs, enabling interpretable descriptor-level prediction with a vision–language model. A two-stage training strategy combining supervised fine-tuning and mistake-driven preference optimisation significantly improves recall over baseline models, enabling cross-platform content compliance auditing in mobile app ecosystems. Finally, building on the ordinal modelling used for rating prediction, the thesis concludes with RankOOD, a unified framework that detects out-of-distribution samples by analysing class-wise ranking violations in model outputs, achieving state-of-the-art performance.

Date

2026

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering, School of Computer Science

Awarding institution

The University of Sydney

Subjects

Vision Language Models
Multimodal Learning
Content Rating
Android apps
iOS apps
out-of-distribution detection