| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Lin, Jinxu | |
| dc.date.accessioned | 2025-10-20T00:35:41Z | |
| dc.date.available | 2025-10-20T00:35:41Z | |
| dc.date.issued | 2025 | en |
| dc.identifier.uri | https://hdl.handle.net/2123/34416 | |
| dc.description | Includes publication | |
| dc.description.abstract | As diffusion models gain widespread adoption, concerns over the misuse of copyrighted and private images have become increasingly prominent. A promising approach to mitigating these issues is to identify the contribution of individual training samples to the generative process, a task referred to as data attribution. Existing data attribution methods for diffusion models typically assess the contribution of a training sample by examining the change in diffusion loss when the sample is included or excluded during training. However, we contend that the direct use of diffusion loss fails to accurately capture this contribution due to the nature of its calculation. Specifically, these methods rely on computing KL-divergence, measuring the divergence between predicted and ground-truth distributions. This indirect comparison of predicted distributions inadequately reflects the variations in model behavior caused by different training samples. To address these limitations, we propose the Diffusion Attribution Score (DAS), a novel attribution score that enables direct comparisons between predicted distributions to evaluate the importance of individual training samples. DAS is grounded in rigorous theoretical analysis, which we detail to substantiate its efficacy in attributing data influence in diffusion models. Moreover, we present optimization strategies to accelerate DAS computations, making it efficient to apply to large-scale diffusion models. Extensive experiments conducted across diverse datasets and diffusion models show that DAS significantly outperforms existing benchmarks, achieving superior results in terms of the linear data-modeling score and establishing a new state of the art in data attribution performance. | en |
| dc.language.iso | en | en |
| dc.subject | Artificial Intelligence | en |
| dc.subject | Deep Learning | en |
| dc.subject | Image Generation Models | en |
| dc.subject | Diffusion Models | en |
| dc.subject | Explainable AI | en |
| dc.subject | Training Data Attribution | en |
| dc.title | Efficient Training Data Attribution on Diffusion Models | en |
| dc.type | Thesis | |
| dc.type.thesis | Doctor of Philosophy | en |
| dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en |
| usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Computer Science | en |
| usyd.degree | Master of Philosophy M.Phil | en |
| usyd.awardinginst | The University of Sydney | en |
| usyd.advisor | Xu, Chang | |
| usyd.include.pub | Yes | en |

