linuxgreenscreen has been stuck in limbo for some time. My most pressing task is to construct a frontend GUI, as I think this would help it reach a wider audience. I’m also looking for work, so a demonstration of frontend development would be handy to have.
However, there’s another development opportunity for myself: machine learning in Python. I haven’t had many professional opportunities to do machine learning-based portrait segmentation, but I have developed a metric f-tonne of knowledge in Bayesian statistics. So the plan for this side-quest is to work towards a real-time Bayesian portrait segmentation demonstration.
But first! I need ethically sourced data.
Ethical data for portrait segmentation
Opinion: To use an image of someone, you should have their consent.
More specifically, you should have some record that they consent to its use for the purpose that you have in mind. If you have doubt that this is the case, then you shouldn’t use it.
This is not, by and large, the convention in the machine learning (ML) literature. To be fair, there is a convention in research that the requirement for consent may be waived if approved by a special ‘Ethics Committee’ (see e.g. the National Statement on Ethical Conduct in Human Research 2023). Yet, as I’ll show, this isn’t observed in the ML field either.
I wanted to investigate the ethical practices of the key publicly available portrait segmentation data sets. I obtained a reasonable shortlist of data sets from the EasyPortrait pre-print. For each data set I checked:
- Were images sourced from subjects directly, or from a ‘public’ resource like the Internet?
- Was there a record in the documentation of the process by which consent was obtained from subjects?
- Was consent revocable?
- Was there an ethics statement and/or committee approval?
- What license applied to the data set?
- Was there cursory evidence of copyright, license, or consent infringement?
Data set | Source | Consent | Ethics | License | Notes |
---|---|---|---|---|---|
Helen (2012)* | Internet (Flickr) | No | No | Unknown | |
LFW-PL (2013) | Internet | No | No | Unknown | |
EG1800 (2016)** | Internet (Flickr) | No | No | P+R | |
AiSeg (2018) | Internet (various) | No | No | MIT | |
CelebMaskHQ (2020) | Internet (various) | No | No | R (NC) | |
Persons Labelled (2020) | Internet (Pexels) | No | No | NC | |
LaPa (2020) | Derived (Megaface) | N/A | N/A | R (NC) | Dubious license |
FVS (2021) | Teleconf | No^ | No | GPL3 | README says MIT |
iBugMask (2021) | Derived (Helen) | N/A | N/A | MIT | License violation |
P3M-10K (2021) | Internet (not stated) | No | No | MIT + terms | © violation |
PPR10K (2021) | Purchased | No | No | Apache 2.0 | Subset only |
Face Synthetics (2021) | Synthetic | N/A | N/A | R | |
FaceOcc (2022) | Derived (CelebMaskHQ) | N/A | N/A | MIT | License violation |
PP-HumanSeg14K (2022) | Teleconf | No^ | No | Apache 2.0 | |
EasyPortrait (2023) | Teleconf | Irrevocable | No | CC-derived | |
License abbreviations: NC = non-commercial use (e.g. research, teaching, personal use); P = personal use; R = research or teaching use.
* Available via Wayback.
** Available via Wayback (follow link to dataset.zip) as at 09/2024.
^ The word ‘consent’ is not found in the paper, official website, or repository.
Now, I know I have high standards, but this is essentially a mess:
- Rarely was consent formally sought from the subject of the photo.
- When consent was sought, it was implied “in perpetuity”; people should be able to withdraw consent.
- Not even one paper had an ethics statement.
- Many licenses used were not designed for photo assets (e.g. MIT is a source code license).
- Images collected from the photo sharing sites were occasionally used in violation of their license.
By way of example: P3M-10K (“privacy preserving”) simply stated that images were from the Internet with “free use license”, yet consider the first sample at P3M-10K GitHub:
- Pop it into Google Lens.
- The original (with the previously blurred facial features now entirely visible) comes up on an astrologist’s website.
- Looking at the source for the astrologist’s site we find another name.
- This name belongs to the photographer, who has put the photo up on a widely used (and problematic) image sharing website.
- The photographer’s website is currently not accessible due to security protocol errors: so there’s no way to verify consent was sought.
So: did the subject of the photo agree to the irrevocable conditions for downloading, copying, and modifying, as stated by the image sharing website? Almost surely not. Did the researcher verify consent? Almost surely not.
Also, the third image in the sample is, wait for it, Kenneth Branagh. OK, so are celebrities fair game? Maybe. But again, put this into Google Lens and it (almost always) appears with “© Steffan Hill/LeftBank Pictures”. Just because a picture ends up on Pinterest doesn’t make it “free use”.
Some data sets are derived from sources which have more restrictive licenses than their sources, e.g. compare this LaPa license:
… freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation …
to the MegaFace license from which it was derived:
- Researcher shall use the Database only for non-commercial research and educational purposes.
Likewise, iBugMask and FaceOcc were released with permissive ‘MIT’ licenses, yet were derived from data sets like AFLW and CelebMaskHQ that have non-commercial, research-only licenses.
Finally, many of these data sets contain (a lot of) photos of children, toddlers, and babies. From a consent and ethics perspective, I certainly do not feel comfortable including those.
Conclusion and plan for ethically-sourced data
The EasyPortrait data seems, to me, the least problematic, since as far as I can tell the team sought consent from the subjects of the images. However, I still don’t believe I should use the data as is, though I do believe I can use the segmentation models that the EasyPortrait team provide.
My new (most ethical possible) plan is to gather my own data:
- capture (using my webcam) images of myself with different angles, offset, zoom, lighting, clothing, headset, microphone etc;
- also, capture some frames of plain background;
- use an existing EasyPortrait-based segmentation model to get a rough draft of the mask;
- replace some proportion of the backgrounds using CC0-licensed assets.
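The last step of the plan is essentially alpha compositing: blend each captured frame against a CC0 background using the (possibly soft) mask. A minimal NumPy sketch, with array shapes and the function name chosen for illustration:

```python
import numpy as np

def composite(frame, mask, background):
    """Alpha-composite a webcam frame over a replacement background.

    frame, background: float arrays of shape (H, W, 3) with values in [0, 1].
    mask: float array of shape (H, W); 1.0 = person, 0.0 = background.
    """
    alpha = mask[..., None]  # add a trailing axis so the mask broadcasts over colour channels
    return alpha * frame + (1.0 - alpha) * background

# Tiny synthetic demo: a uniform grey 2x2 "frame" composited over a black background.
frame = np.full((2, 2, 3), 0.8)
background = np.zeros((2, 2, 3))
mask = np.array([[1.0, 0.0],
                 [0.0, 1.0]])
out = composite(frame, mask, background)
```

A soft (fractional) mask gives feathered edges for free, which is exactly what a rough EasyPortrait-derived draft mask will produce before any manual clean-up.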
Promise #1: Details about this workflow will be available in a future blog post.
Setting out on a Bayesian portrait segmentation model
First, I lack familiarity with the computational methods in ML, let alone Bayesian ML. I have already found an extremely helpful guide by Jospin et al. (2022).
Despite the authors’ intent to write an introduction to Bayesian neural networks (BNNs) for those who already know ML concepts, I found it extremely helpful the other way around: as an introduction to ML concepts for someone coming from Bayesian statistics.
The examples from the manuscript are available at github.com/french-paragon, and a pre-print is also available: arXiv:2007.06823!
I have gleaned that there are a few further concepts and methods, particularly from ‘conventional’ ML, I should familiarise myself with:
- Functional models and the design of MediaPipe’s selfie segmentation model (e.g. via inspection using netron.app).
- Backpropagation for neural networks.
- Stochastic gradient descent and the Adam optimisation algorithm.
- Stochastic variational inference and Bayes-by-backpropagation.
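To make the last item concrete, here is a minimal NumPy sketch of the reparameterisation trick at the heart of Bayes-by-backpropagation, for a single Gaussian weight. It deliberately omits the prior/KL term and the neural network, and every name and value is illustrative, not from Jospin et al.:

```python
import numpy as np

rng = np.random.default_rng(0)

# Variational parameters of a single Gaussian weight, q(w) = N(mu, sigma^2),
# with sigma = log(1 + exp(rho)) to keep it positive.
mu, rho = 0.0, -3.0
lr = 0.05
target = 2.0  # toy "data": the loss wants w near 2

for _ in range(2000):
    sigma = np.log1p(np.exp(rho))
    eps = rng.standard_normal()
    w = mu + sigma * eps       # reparameterisation trick: sample w ~ q(w)

    dloss_dw = w - target      # gradient of the toy loss (w - target)^2 / 2

    # Chain rule through the reparameterisation:
    # dw/dmu = 1, and dw/drho = eps * dsigma/drho.
    dsigma_drho = 1.0 / (1.0 + np.exp(-rho))
    mu -= lr * dloss_dw
    rho -= lr * dloss_dw * eps * dsigma_drho

# mu should end up close to the target; with no prior/KL term in the loss,
# sigma also shrinks (the variational posterior collapses towards a point).
```

The trick is that sampling is rewritten as a deterministic function of (mu, rho) plus independent noise, so ordinary backpropagation can reach the variational parameters; a real BNN adds the KL-to-prior term and does this per weight.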
Once I’m ready, I’ll develop a plan for the project and share it with you.
Promise #2: I’ll share the project plan in a future blog post.
That’s all for now, stay tuned! I hope to wrap this project up quickly, unless I get a job first!
References
Chu, L., Liu, Y., Wu, Z., Tang, S., Chen, G., Hao, Y., … & Xiong, H. (2022). PP-HumanSeg: connectivity-aware portrait segmentation with a large-scale teleconferencing video dataset. In 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) (pp. 202-209). IEEE. doi:10.1109/WACVW54805.2022.00026.
Jospin, L. V., Laga, H., Boussaid, F., Buntine, W., & Bennamoun, M. (2022). Hands-on Bayesian neural networks β A tutorial for deep learning users. IEEE Computational Intelligence Magazine, 17(2), 29-48. doi:10.1109/MCI.2022.3155327.
Kae, A., Sohn, K., Lee, H., & Learned-Miller, E. (2013). Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling. In 2013 IEEE Conference on Computer Vision and Pattern Recognition (pp. 2019-2026). IEEE. doi:10.1109/CVPR.2013.263.
Kuang, Z., & Tie, X. (2021). Flow-based video segmentation for human head and shoulders. arXiv preprint arXiv:2104.09752.
Kvanchiani, K., Petrova, E., Efremyan, K., Sautin, A., & Kapitanov, A. (2023). EasyPortrait–Face Parsing and Portrait Segmentation Dataset. arXiv preprint arXiv:2304.13509.
Le, V., Brandt, J., Lin, Z., Bourdev, L., & Huang, T. S. (2012). Interactive facial feature localization. In Computer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, 3(12) (pp. 679-692). Springer Berlin Heidelberg. doi:10.1007/978-3-642-33712-3_49.
Lee, C. H., Liu, Z., Wu, L., & Luo, P. (2020). MaskGAN: Towards Diverse and Interactive Facial Image Manipulation. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 5548-5557). IEEE. doi:10.1109/CVPR42600.2020.00559.
Li, J., Ma, S., Zhang, J., & Tao, D. (2021). Privacy-preserving portrait matting. In Proceedings of the 29th ACM international conference on multimedia (pp. 3501-3509). Association for Computing Machinery. doi:10.1145/3474085.3475512.
Liang, J., Zeng, H., Cui, M., Xie, X., & Zhang, L. (2021). PPR10K: A large-scale portrait photo retouching dataset with human-region mask and group-level consistency. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 653-661). IEEE. doi:10.1109/CVPR46437.2021.00071.
Lin, Y., Shen, J., Wang, Y., & Pantic, M. (2021). Roi tanh-polar transformer network for face parsing in the wild. Image and Vision Computing, 112, 104190. doi:10.1016/j.imavis.2021.104190.
Liu, Y., Shi, H., Shen, H., Si, Y., Wang, X., & Mei, T. (2020). A new dataset and boundary-attention semantic segmentation for face parsing. In Proceedings of the AAAI Conference on Artificial Intelligence, 34(7) (pp. 11637-11644). AAAI. doi:10.1609/aaai.v34i07.6832.
Shen, X., Hertzmann, A., Jia, J., Paris, S., Price, B., Shechtman, E., & Sachs, I. (2016). Automatic portrait segmentation for image stylization. Computer Graphics Forum, 35(2) (pp. 93-102). doi:10.1111/cgf.12814.
Wood, E., Baltrušaitis, T., Hewitt, C., Dziadzio, S., Cashman, T. J., & Shotton, J. (2021). Fake it till you make it: face analysis in the wild using synthetic data alone. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 3661-3671). IEEE. doi:10.1109/ICCV48922.2021.00366.
Yin, X., & Chen, L. (2022). FaceOcc: A diverse, high-quality face occlusion dataset for human face extraction. arXiv preprint arXiv:2201.08425.