Texts in the health domain of the Danish DK-CLARIN LSP corpus come from sundhed.dk. All texts are in TEI P5 XML (the TEIP5DKCLARIN format). Tokenisation, POS tagging, lemmatisation and termhood annotation are stored outside the running text as stand-off annotation, each layer in its own span group.
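As a rough illustration of how annotation layers kept in separate span groups can be joined back onto the tokens, here is a minimal Python sketch. The sample document is hypothetical and is not taken from the corpus: the use of `<w xml:id="...">` for tokens, the `type` values `pos` and `lemma`, the `from="#id"` pointers and the tag values are all assumptions based on generic TEI P5 `spanGrp`/`span` usage, and the actual TEIP5DKCLARIN layout may differ.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical miniature document (NOT real corpus data): tokens live in the
# text, and each annotation layer sits in its own <spanGrp> pointing at tokens.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body><p>
      <w xml:id="t1">Patienten</w>
      <w xml:id="t2">har</w>
      <w xml:id="t3">feber</w>
    </p></body>
    <spanGrp type="pos">
      <span from="#t1">N</span>
      <span from="#t2">V</span>
      <span from="#t3">N</span>
    </spanGrp>
    <spanGrp type="lemma">
      <span from="#t1">patient</span>
      <span from="#t2">have</span>
      <span from="#t3">feber</span>
    </spanGrp>
  </text>
</TEI>"""


def merge_standoff(xml_string):
    """Return one dict per token, combining the form with each stand-off layer."""
    root = ET.fromstring(xml_string)

    # Index tokens by xml:id, keeping document order (dicts preserve insertion order).
    tokens = {}
    for w in root.iter(f"{{{TEI_NS}}}w"):
        tokens[w.get(XML_ID)] = {"form": (w.text or "").strip()}

    # Attach every span's value to the token it points at, keyed by layer name.
    for grp in root.iter(f"{{{TEI_NS}}}spanGrp"):
        layer = grp.get("type")
        for span in grp.iter(f"{{{TEI_NS}}}span"):
            target = (span.get("from") or "").lstrip("#")
            if target in tokens:
                tokens[target][layer] = (span.text or "").strip()

    return list(tokens.values())


if __name__ == "__main__":
    for row in merge_standoff(SAMPLE):
        print(row)
```

Keeping each layer in its own span group means a layer can be added, replaced or ignored without touching the text itself; the cost is a join step like the one above whenever several layers are needed together.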