How to digitise slides. Recommendations and working lists for the reproduction of a very special artefact

Some reflections on photographing techniques and shooting parameters

As stated in the introduction to this document, this section will be limited to technical questions. For practical advice see section “Practical examples of accomplished digitising projects”.  The following points were selected as most of them are addressed by the practical recommendations given by members of the “A Million Pictures” working group.


4.1 White balance adjustment and automatic white tracking
Slides are photographed in a working space where light that looks “white” to the human eye often has in fact a certain colour temperature. Depending on the kind of light source used, the light rays falling on an object can be blueish (fluorescent light), reddish-yellowish (incandescent light) or greenish (neon light). The eye waves this “tinting” away because, on the one hand, the brain is trained to think of certain objects as white (e.g. printer paper, a damask tablecloth) or green (e.g. grass). On the other hand, according to Zeng (2001, p. 11-12), the cones excited by the colour which is preponderant in the mentioned cases reduce their sensitivity until they reach the same level of stimulation as the two other groups of cones, which cancels the tint effect and makes the shade appear white.


Nevertheless, when a human being really concentrates on a “whitened surface”, the tint can be detected: e.g. a wall painted in “RAL 9010”, which looks white in daylight, can have a shimmer of brown-orange or blue in the evening, depending on the colour of the light reflected from the lamp in the room. The same can happen with coloured surfaces. What the eye normally ignores, the camera cannot: a colour cast on a slide will be recorded and give an erroneous impression of the original colours. Therefore this disturbing influence has to be neutralised; the camera system has to be told what “white” is by giving it a reference.


Describing light as “warm” or “cold” is a subjective impression. Both terms refer to a bright sunny day at noon, which is defined as neutral: sunlight at midday (c. 5500-6000 Kelvin) is considered “white”, higher Kelvin values (e.g. blue sky of 10,000 Kelvin or more) are called “cold”, lower ones (e.g. candle light of 2000 Kelvin) “warm” (see also section “Light sources for the digital reproduction of slides and lanterns”). As the camera registers the tint of the supposedly “white” light, the photographer has to counterbalance what would later appear on the photograph as “too cold” (blueish, greenish) or “too warm” (reddish, yellowish, orange). As the neutral 5500-6000 Kelvin of a sunny midday are the reference in photography, a scene with other light conditions has to be “colour-corrected” by the camera (automatically or manually) before the image is taken. This avoids time-consuming “colour repair” in post production, which may not even lead to an optimal result. The correction is done by adjusting the camera’s colour channels. This method is called “white balance” or “white tracking” and has to make sure that the colours of the slide are reproduced without discolouration.


The digital still-camera has pre-installed settings (“auto white tracking”) that correspond to the lighting whose misleading effect the photographer wishes to counterbalance. These automatic functions may not always work correctly; therefore most photographic experts suggest setting the parameters manually, especially when the system is confronted with mixed light. A photographer can keep out daylight by closing the curtains to avoid a mixture of artificial and natural light, which disturbs the automatic white tracking function. But even two lamps with light bulbs from the same production run can confuse the camera: if the bulbs have aged differently, they may emit light of different wavelengths.


For a manual adjustment of the white balance the photographer has to indicate to the camera what it has to consider as “white”. This can be a certain area on the artefact, a definitely white object (e.g. a sheet of bright white paper), a purchased “white card” or even a “grey card”. Before taking the picture, a white element on the object is selected, or the grey card is put in the place of the slide. Whatever is focussed on has to fill the objective’s field of view entirely to avoid rays from surrounding objects. The reflected light is measured by the camera system and, once the chosen white is registered, the camera will interpret this setting as “white” for the next photograph. As it represents an average of the light, this arrangement prevents the objective from taking an uncontrolled element as its reference point in a heterogeneous lighting situation. Keeping the reference setting “in mind”, the camera will add, with the help of its colour channels, the opposite colour to the image, e.g. more blue to “cool down” an image which would otherwise risk being reddish, or more red to make a “cool” blueish image appear warmer. The intention is to make images look as if they were photographed ‘outside’ at noon on a sunny day; by tracking the white the photographer adjusts the three RGB colour channels in such a way that they produce a “balanced” picture which looks “neutral”. The setting for white has to be determined with attention to the surroundings, which should be neutral and light-absorbing: thus the photographer should not wear coloured clothes, as light that falls on her/his red shirt is reflected onto the slide and may give it a tint.
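The channel adjustment described above can be illustrated with a minimal sketch. It assumes that the camera has measured an average RGB value on the grey card and that green serves as the anchor channel (a common convention, but an assumption here); the measured values and the function names are hypothetical, and real cameras work on unprocessed sensor data with more elaborate models.

```python
from typing import Tuple

RGB = Tuple[float, float, float]


def gains_from_reference(reference: RGB) -> RGB:
    """Per-channel gains that make the measured reference patch neutral.

    Green is kept unchanged and red and blue are scaled towards it
    (assumed convention for this sketch).
    """
    r, g, b = reference
    if min(r, g, b) <= 0:
        raise ValueError("reference patch needs positive values in all channels")
    return (g / r, 1.0, g / b)


def apply_gains(pixel: RGB, gains: RGB, max_value: float = 255.0) -> RGB:
    """Scale each channel by its gain and clip to the valid range."""
    return tuple(min(max_value, value * gain) for value, gain in zip(pixel, gains))


# Hypothetical measurement: a grey card photographed under warm (reddish) light.
grey_card = (150.0, 120.0, 90.0)
gains = gains_from_reference(grey_card)
balanced = apply_gains(grey_card, gains)

print("gains (R, G, B):", gains)
print("grey card after balancing:", balanced)
```

In this example red is attenuated and blue is amplified, which corresponds to “cooling down” an image that would otherwise look reddish.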


Indicating the white point with the help of a slide can be difficult as it is a transparent, not an opaque object. When a slide is projected on a screen, “white” is more often produced by the light beam that passes through unpainted areas of the glass (especially with self-made or black and white slides) than by a white dye. On a camera stand, what is measured is the whiteness of the beam from the light box. Thus a substitute is needed. If one wishes to photograph with light from above, it is recommended to use a grey or white card to white balance the camera. Why a grey card? A white surface reflects all colours fully while a grey card reflects equal portions of each colour (Eastman 1992, p. 54); therefore a grey card can also be considered neutral. The grey corresponds to one particular step on the grey scale: the one which reflects precisely 18% of the light that hits it. Michael Langford (2008, p. 26) explains the reason: when all the cones in a human eye are equally activated, they see the light as “white or neutral grey”. According to John Hedgecoe (2004, p. 229) most light meters are calibrated against this reference value.


It is highly recommended to do “white tracking” before the take, as correcting the colour balance in post production requires a lot of experience: in the film business, light and colour correction is done by a trained expert (called “grader” or “timer”) with many years of experience. Photographer Tom Striewisch (2009, p. 412-413) and others insist that a later correction is unproblematic with the RAW format; with files in JPEG or TIFF, on the contrary, it risks degrading the image’s quality. As colour correction is empirical and time-consuming, white balancing should be done as accurately as possible at the time of capture. However, a tendency can be observed in museums: “‘Capture now – process later’ is beginning to take a hold in some institutions.” (Frey 2011, p. 119) Nevertheless, capturing carefully from the start is always better as it is not known where, by whom and how many years later the processing will be done. The original slide may then no longer be available as a reference point.


Practical hints given by photographers

Some photo manufacturers sell grey cards, but if none is available for the digitising process, photographer Tom Striewisch (2009, p. 382) recommends pointing the objective at the palm of the hand, which is one stop brighter than the card. This has to be taken into account for the shooting: the aperture needs to be opened one stop more.


It is best to digitise the whole collection with the same still-camera, as even two consecutive camera models from one company, set identically, may reproduce the same image with a slightly different look on the same monitor. This effect can be due to modified colour filters on the camera’s sensor in combination with the way the RAW converter calculates the colour temperature of the image (see John Bosley, “Understanding White Balance – A Beginner’s Guide” and a comment by Iliah Borg, accessed 3.11.2017).


4.2 Flaws in reproduced slides
Several flaws can occur when taking digital photographs. They are caused by the camera’s technology.


4.2.1 White noise

What is called “white noise” is an irregular scattering of bright dots, visible in dark parts of the image and caused by erroneous information emitted by the sensor (see section “Scanning deficiencies, and scanning defective slides” in the scanning part). According to Tom Striewisch (2009, p. 44-45) higher temperatures intensify the tendency of light-sensitive units to produce noise; thus camera manufacturers try to keep the chip away from heat-emitting parts such as batteries and displays. The still-camera should not be left in direct sunlight as this will warm up the sensor and cause unwanted artefacts.


Another source of noise is the selected ISO number, which determines how sensitively the sensor responds to light. (“ISO” stands for International Organization for Standardization; its standard has replaced the former photographic sensitivity standards DIN (Deutsches Institut für Normung e.V.) and ASA (American Standards Association).) With most still-cameras ISO 100 is the lowest level and signals low speed (but ISO 50 also exists); ISO 3200 or 6400 (or even more) form the other end of the scale and mean an extreme response to light. The higher the ISO number selected for a shot, the higher the risk of noise. As photographic expert Anselm Wunderer (2015, p. 110-111) explains, a sensitivity of ISO 100 corresponds to the capacity of a typical sensor in a DSLR camera. Higher ISO values are the result of an amplification which also intensifies the noise inherent to all electronic devices. According to him, ISO 200 means that only 50% of the real light energy is present; to reach the needed illumination the same amount has to be added with the help of the amplifier, increasing it to 100%; with ISO 400 the amplification is 200% etc. As a consequence, the picture’s quality is reduced: its dynamic range is smaller, its colours are less nuanced, and dots appear. If they are coloured, they are called “colour noise” or “chromatic noise”. Therefore most photography books recommend using the lowest ISO level possible.
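The relation between ISO number and captured light can be made concrete with a small calculation. The sketch below is a simplified model, not Wunderer’s own figures: it assumes a native sensitivity of ISO 100 and the linear ISO scale, on which every doubling of the ISO number halves the light needed for a correct exposure. The function names are chosen for illustration only.

```python
import math

BASE_ISO = 100  # assumed native sensitivity of the sensor


def light_fraction(iso: int, base_iso: int = BASE_ISO) -> float:
    """Share of the base-ISO light that reaches the sensor at a correct exposure."""
    return base_iso / iso


def signal_multiplier(iso: int, base_iso: int = BASE_ISO) -> float:
    """Factor by which the weaker signal must be multiplied to look equally bright."""
    return iso / base_iso


def stops_above_base(iso: int, base_iso: int = BASE_ISO) -> float:
    """Number of exposure stops between the base ISO and the selected ISO."""
    return math.log2(iso / base_iso)


for iso in (100, 200, 400, 800, 1600, 3200, 6400):
    print(f"ISO {iso:>4}: {light_fraction(iso):7.2%} of the light, "
          f"signal x{signal_multiplier(iso):g}, "
          f"{stops_above_base(iso):g} stops above base")
```

Since the noise of the electronics is multiplied together with the signal, the table shows why the lowest practicable ISO setting is preferred for reproduction work.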


4.2.2 “Hot” pixels and defect pixels

Hot pixels are pixels on the sensor that are not working correctly, while defective ones (also called “stuck pixels”) do not work at all. Both can be recognised on the reproduction as single dots which never move: the dots are black when the pixel is not working at all, and coloured when it is working incorrectly. According to Tom Striewisch (2009, p. 44-45, 385, 402) the coloured dots, especially visible in dark parts of the image, are due to single light-sensitive units that produce “noise”, which means they send wrong light information. If the pixels concerned cannot be repaired, the dots have to be eliminated in post production.
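Because such dots never move, they can be located once and then corrected in every image. The following sketch shows one possible approach, which is an assumption and not a method described by the cited authors: a “dark frame” (a shot taken with the objective covered) reveals the hot pixels, and each faulty value is replaced by the median of its healthy neighbours. The tiny greyscale arrays and the threshold are hypothetical; a pixel that stays black would have to be found with an evenly lit frame instead.

```python
from statistics import median_low
from typing import List, Tuple

Image = List[List[int]]          # greyscale image as rows of pixel values
Position = Tuple[int, int]       # (row, column)


def find_hot_pixels(dark_frame: Image, threshold: int) -> List[Position]:
    """Positions that are brighter than `threshold` although no light reached the sensor."""
    return [(r, c)
            for r, row in enumerate(dark_frame)
            for c, value in enumerate(row)
            if value > threshold]


def repair_pixels(image: Image, defects: List[Position]) -> Image:
    """Return a copy in which each defect is replaced by the median of its healthy neighbours."""
    defect_set = set(defects)
    height, width = len(image), len(image[0])
    repaired = [row[:] for row in image]
    for r, c in defects:
        healthy = [image[rr][cc]
                   for rr in range(max(0, r - 1), min(height, r + 2))
                   for cc in range(max(0, c - 1), min(width, c + 2))
                   if (rr, cc) not in defect_set]
        if healthy:
            repaired[r][c] = median_low(healthy)
    return repaired


# Hypothetical 3 x 3 example with one hot pixel in the centre.
dark_frame = [[2, 3, 2], [1, 250, 2], [3, 2, 1]]
slide_image = [[10, 12, 11], [13, 255, 12], [11, 10, 12]]

hot = find_hot_pixels(dark_frame, threshold=50)
repaired = repair_pixels(slide_image, hot)
print("hot pixels:", hot)
print("repaired centre value:", repaired[1][1])
```

The original image is left untouched, so the uncorrected capture remains available for documentation.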


4.2.3 Moiré

This interference of two regular structures – the sensor’s grid and the raster on the photographed object – is explained in the scanning part. It can be reduced or eliminated by retaking the picture from a slightly different angle and distance (see section “Scanning deficiencies, and scanning defective slides” in the scanning part, also for Newton rings).


4.2.4 Blooming

Sometimes white areas show so-called “blooming”: strong light can produce such a high charge in a photo-diode that the electricity “jumps” onto the neighbouring dark (i.e. charge-free) photo-diode, which transmits it to the next etc. This creates over-exposure across the image in a horizontal or vertical direction until the charge is stopped by the sensor’s edges. Blooming produces visible bright lines on the reproduction which, in the described case, end at the image borders (Striewisch 2009, p. 43-44, 370). Normally charge-coupled device (CCD) sensors are affected; as most cameras nowadays work with CMOS sensors, which are rarely concerned, blooming is less of a problem in photographed images than in scanned pictures.


4.3 Compression and file formats – RAW or not RAW?
One of the important decisions the photographer has to take before digitising a slide concerns the file format s/he will work with. There are several formats; most professionals choose RAW. However, this choice is not without risks.


4.3.1 The file formats – capturing and delivery


There are two file “stages” of the picture: the file that contains what the still-camera has photographed (“capturing format”), and the one in which the content is kept, reworked, shared and stored (“delivery format”). Before buying a digital still-camera it is necessary to know what will be done with the reproductions. If long-term storage is one of the aims, the camera should be able to produce uncompressed RAW images as capturing and delivery format. This allows the widest range of interventions in post production, which may be needed for different uses of the images. (Most photographic experts write that RAW files are uncompressed, but UPDIG (p. 19) seems not to agree as the authors write about “’lossless’ compression types, such as LZW-compressed GIF and TIFF, PSD and most raw file formats”.) For short-term or “internet only” use it could be acceptable to acquire a camera generating compressed files if the degree of compression can be determined by the photographer. All photographic experts recommend working in RAW, but the format is not exempt from problems.


4.3.2 The RAW-format


The RAW format is nicely explained by Michael Reichmann and Jürgen Specht (2005) from the OpenRAW Working Group: “[…] it’s the output of the camera’s sensor with minimal processing. This means it contains all of the data about the image captured. Regard it as a digital negative. A negative though that has not yet been processed. The image is therefore latent; there, but undeveloped. This offers huge advantages to the digital photographer, because it allows us to re-visit our unprocessed files at any time in the future, and reprocess them again, as we find appropriate.” This concept of the RAW file is seemingly widespread: “It is comparable to the latent image contained in an exposed, but undeveloped, piece of film.” As with the latent image, which needs to undergo a process to turn into a (negative) image, the RAW file needs a “developer”, a special software (RAW converter), to make its content visible, accessible and reworkable in post production.


Due to the sensor’s registration technique, RAW and other file formats work with “colour separation”, which means that a colourful slide, when photographed by a device using the RGB colour space, is reproduced in three grey “portions”, each containing the light values of one colour (red, green or blue) and depicting the same areas, but in different grey tones. Striewisch (2009, p. 47-49, 51) states that RAW works with a colour depth of at least 10 bits, but can handle more than 14 bits per channel, and has one colour channel per pixel when used on a sensor with a Bayer mask, which reduces its storage space compared to a TIFF file with three 8-bit colour channels per pixel (16 bits per channel is also used in digital photography). If an uncompressed TIFF file format is selected, the file is nevertheless automatically submitted to the camera’s processor to track the white balance or increase sharpness, saturation and contrast (Striewisch 2009, p. 47, Matthai 2008, p. 52). A JPEG file also has 8 bits per channel; due to its high compression rate it needs the smallest memory size of all. A higher colour depth makes the data volume heavier: one uncompressed image in 3 x 8-bit has c. 15 MB while the same image in 3 x 16-bit has c. 30 MB. Thus RAW files reproduce the captured light values identical to what was reflected from the slide; the data files are bigger than in the JPEG format, but today’s memory cards hold up to two terabytes and can cope with this. (For more information see section “The digital still-camera and its components. 9. Memory card, storage capacity and bitrate” in the photographic section.)
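The data volumes mentioned above follow from simple arithmetic: number of pixels times channels per pixel times bits per channel. The sketch below assumes a hypothetical sensor of roughly five megapixels (2592 x 1944 pixels), for which the quoted values of c. 15 MB and c. 30 MB are plausible; file headers and any compression are ignored.

```python
def uncompressed_size_bytes(width: int, height: int,
                            channels: int, bits_per_channel: int) -> int:
    """Size of the bare pixel data, without header or compression."""
    return width * height * channels * bits_per_channel // 8


WIDTH, HEIGHT = 2592, 1944  # hypothetical sensor of about 5 megapixels

tiff_8bit = uncompressed_size_bytes(WIDTH, HEIGHT, channels=3, bits_per_channel=8)
tiff_16bit = uncompressed_size_bytes(WIDTH, HEIGHT, channels=3, bits_per_channel=16)
# Bayer-mask RAW: one colour value per pixel, here assumed to be 14 bits deep.
raw_14bit = uncompressed_size_bytes(WIDTH, HEIGHT, channels=1, bits_per_channel=14)

for label, size in (("TIFF 3 x 8-bit", tiff_8bit),
                    ("TIFF 3 x 16-bit", tiff_16bit),
                    ("RAW 1 x 14-bit", raw_14bit)):
    print(f"{label:<16} {size / 1_000_000:6.1f} MB")
```

The comparison also illustrates why a RAW file with one colour value per pixel can be smaller than an 8-bit TIFF of the same image although it records finer tonal steps.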


According to post production specialist Matthias Matthai (2008, p. 51), the more a camera is made for professionals, the less it is programmed to manipulate ‘behind the photographer’s back’. The RAW mode defers any necessary corrections of image parameters from the camera to the workstation, where they can be made under better controlled conditions.


4.3.3 JPEG, TIFF and other formats


Many experts insist that the camera should not just deliver JPEG, or TIFF in 8-bit. Michael Reichmann and Jürgen Specht (2005) explain: “Once the camera creates the JPG file it throws away the sensor’s data, and its [sic] ready to take the next shot.” Although a JPEG image can be reworked in post production, the image was processed and the file created by the camera; it has therefore lost information (due to compression), which reduces its manipulability to a certain extent. According to Reichmann and Specht, when JPEG is chosen the “latent impression” is submitted to important interferences: “You have two basic choices. You can have the camera process it, or your can do it yourself later on the computer. If you choose that the camera does it, you set the camera to output a JPG file. This means that in a fraction of a second the camera will process the image, permanently setting the linearity, matrix processing, white point, color balance, color space, sharpening, contrast, brightness, and saturation, and then will save the file to the camera’s memory card in an 8 bit compressed format – i.e.: a JPG file.”


Thus if JPEG is selected as delivery format, the camera electronics will automatically “optimise” the image following pre-programmed parameters. The processing software will intervene and eliminate what it sees as a “flaw”, then present the reworked reproduction without indicating precisely what was manipulated and how. The same happens if TIFF is selected; only in the RAW mode is nothing modified (Matthai 2008, p. 51-52, Brümmer 2006, p. 4). (However, according to Reichmann and Specht (2005), the image is slightly processed to appear on the LCD of the camera, but it seems that this does not affect the output of the file.) Most (amateur) cameras interfere unasked and change the reproduction by applying algorithms. This kind of automatic and uncontrollable intervention by a software program also happens with film scanners and is criticised by the Swiss research group DIASTOR as “black box operation” (Flueckiger 2016, p. 109). Moreover, no manufacturer of still-cameras will reveal how the processor is programmed.


Photographic experts Eib Eibelshäuser (2005, p. 262, 264) and Hans Brümmer (2006, p. 3) point to another problem: JPEG is a “cumulative compression procedure”, which means that each time a file is opened and reworked with post production software, it is compressed again when it is saved. As JPEG compression is lossy, errors accumulate in the file. Eibelshäuser recommends using JPEGs only for proxies. For storage he advises uncompressed RAW files, or TIFF files losslessly encoded with the LZW algorithm. For Eibelshäuser, the file format GIF, which is also LZW-compressed (GIF stands for Graphics Interchange Format, created by CompuServe in the late 1980s to send heavily compressed diagrams and images with the help of a modem), is only usable for small amounts of data as GIF has a colour depth of 8 bits. As to the losslessly compressed file format PNG (pronounced “ping”), it can handle a colour depth of 24 bits.


In earlier years manufacturers also offered uncompressed TIFF files, but it seems that this file format is about to be given up as a camera standard. Photographic expert King (2017, p. 124) recalls that, although it is no longer available as capturing format, it can still be used as conservation format. Photographic experts see it as well suited for storing rasters and other graphic images. The authors of the Memoriav report (Jarczyk 2017, p. 36) recommend TIFF as “archive copy” and suggest: “The archive copy is never to be used. It has to be saved on a reliable medium […]. The archive copy is “raw”. Its settings may not be changed, no retouching is allowed which could change the content of the original. It can be of note to produce a second archive copy which can be kept, retouched and which serves to produced working copies.” Nevertheless this decision is not without consequences: transcoding the data files into another format, from RAW as capturing format to TIFF as delivery format, leads to a loss of data; the new format is therefore always a bit worse than the one it was derived from.


The difference between the file formats thus lies in colour depth and manipulation as well as general readability: TIFF and JPEG are read by any hardware and software and can be shared without problem, while RAW needs the already mentioned special software (RAW converter), which has to be accepted by the post production software the archive uses. They also differ in data volume (JPEG: small, TIFF: big, RAW: between JPEG and TIFF; according to Langford (2008, p. 109) a RAW file is one third smaller than a TIFF file), in the production of annoying artefacts (e.g. JPEG can add visual phenomena to the original image) and in quality: only RAW with its original, untreated data guarantees sustainability and transcoding into all kinds of proxies. As JPEG files are heavily manipulated to make them look fresh and attractive (Striewisch 2009, p. 155), they are limited in their usability and are not future-proof, and they employ a compression algorithm that is considered unsuited for lines and other straight iconic or textual graphics. However, what makes them interesting is that they are immediately shareable, e.g. on the internet.


4.3.4 The RAW format risk


RAW file formats are extremely popular, and most (professional) photographers use them. But they are not without a certain risk: each RAW format is produced for a specific camera, a fact which could lead to problems in the future. Photographic expert Michael Langford (2008, p. 109) stresses that a “Canon Raw File” (CRF) is only created for still-cameras built by Canon, and a “Nikon Electronic File” (NEF) only for those built by Nikon. Michael Reichmann and Jürgen Specht (2005) warn: “The [RAW] file is yours, and you can do with it as you wish, both technically and artistically. But can you? You can if you have a copy of the manufacturer’s proprietary software for decoding the file. And herein lies the problem. What happens if you’ve lost your software disk? What happens if you change computers and cannot find the CD any longer? What happens if […] (the makers of your camera) goes out of business, and no longer has a copy of the software on their web site for you to download? What happens when your new […] computer no longer can read CDs, or DVDs, or its operating system cannot deal with something as old and arcane as Windows XP or Mac OSX?” Thus it is necessary to update computer and camera software regularly and to migrate the files as well (checking each time whether the result is intact or corrupted) to keep the files “alive”.
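Checking whether a migrated file is still intact can be automated with checksums. The sketch below is one common approach and an assumption of this example, not a procedure prescribed by the cited authors: it compares the SHA-256 value of the original file with that of its copy, using only Python’s standard library. The file names are hypothetical, and the check only applies to bit-identical copies (e.g. a move to a new storage medium), not to conversions into another format.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum of a file, read in chunks so that large RAW files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(original: Path, copy: Path) -> bool:
    """True if both files contain exactly the same bytes."""
    return sha256_of(original) == sha256_of(copy)


# Demonstration with temporary stand-in files (hypothetical names and content).
with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "slide_0001.raw"
    good_copy = Path(folder) / "slide_0001_copy.raw"
    bad_copy = Path(folder) / "slide_0001_damaged.raw"

    original.write_bytes(b"sensor data" * 1000)
    good_copy.write_bytes(original.read_bytes())
    bad_copy.write_bytes(original.read_bytes()[:-1] + b"\x00")  # last byte altered

    intact = copy_is_intact(original, good_copy)
    corruption_detected = not copy_is_intact(original, bad_copy)

print("good copy intact:", intact)
print("damaged copy detected:", corruption_detected)
```

Storing the checksum together with each file at the time of capture allows the same comparison to be repeated after every later migration.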


Reichmann and Specht draw attention to the dependency on the digital industry: in mid-2005 there were at least 100 different kinds of RAW formats on the market, some of which are no longer accessible. Camera manufacturers often change software when introducing a new camera model (and limit backwards compatibility); sometimes they even “encrypt” files with patented software (encryption means altering data to conceal the content from unauthorised persons), which can make files unreadable for post production software: some manufacturers may refuse (and have already refused) to include special RAW versions in their programs. Therefore the question of sustainability matters greatly if the archive is to be prevented from losing access to its own images.


Adobe has tried to develop DNG as an alternative to RAW. “DNG” stands for “Digital Negative”, a concept sometimes also used for a RAW image file as it has to be “developed” by the converter to become “visible”. According to the manufacturer’s own statement it is “a publicly available archival format for the raw files generated by digital cameras”. The software giant promotes it as the ultimate solution when working with RAW files: “By addressing the lack of an open standard for the raw files created by individual camera models, DNG helps ensure that photographers will be able to access their files in the future.”


Adobe’s website mentions those companies who have adopted it: “Hundreds of software manufacturers such as Apple and Google have developed support for DNG. And respected camera manufacturers such as Leica, Casio, Ricoh, Samsung, and Pentax have introduced cameras that provide direct DNG support.” To what extent the software producer is completely open about its own creation has to be checked by experts. Besides, Adobe has changed its business policy: it no longer allows its post production software package to be used as a stand-alone solution (on an independent computer) but imposes an online registration, which includes a high risk of data grabbing for the user.


On the other hand, as the big photo companies have their own production lines, often with particular parameters designed to make the use of competitors’ products difficult or impossible, an open source format, although it has not been taken up very widely in the last twenty years, could be interesting, at least for those who work with products of a company on the mentioned list. The experts who wrote the guidelines (UPDIG) seem to consider it “safe” for storage: “Converting raw files to DNG format is considered an excellent method for archiving raw files. DNG was designed as a more universal file format than camera-specific raw formats such as NEF or CR2.” (Photographers Guideline 4.0) The Swiss preservation association Memoriav (Jarczyk 2017, p. 46) also sees DNG as one of several solutions for the long-term storage of reproduced photographs.


Nevertheless, RAW files still offer the broadest working basis for a photographer. JPEG is generally readable and may outlive many proprietary RAW file formats, but it is to be rejected for its high compression and its limited “dynamic range”, which produces a higher risk of clipping (Striewisch 2009, p. 155). For the camera output, there is no good alternative to RAW: RAW file images have “full dynamic range”, and although they need a converter to be made readable, they deliver the data from the sensor “untouched”, contain metadata about the conditions under which the image was taken, and can be handled by most post production software brands and computer operating systems.


To sum up: as no file format is perfect, the camera should be capable of shooting in RAW, which prevents the camera from processing the image (except to create a control picture for the liquid crystal display and to transform the analogue information into data) and allows all the brightness information captured to be saved with up to 16 bits (according to Reichmann and Specht). It is important to make sure that the RAW format used by the still-camera can be read by the RAW converter of one’s own post production software.


4.4 Metadata
Metadata, which are data about data, are added automatically by the camera, the scanner or the mobile phone to each file produced. They contain technical information e.g. on the day and time of the shot, camera type, resolution, camera settings, colour space etc. Later, other properties concerning e.g. copyright or content can be added. These data are stored in a so-called “header”, a term referring to the fact that the information is placed at the beginning of a file and is read by a digital device before it accesses the content. Standards for metadata specify which kind of information has to be kept to ensure that the image can always be identified and is retrievable everywhere. The white paper on metadata by the International Press Telecommunications Council (IPTC) states: “For metadata to be effective, it must be incorporated into the workflow at all phases of image production, distribution and use and then remain with the image.” (IPTC 2007, p. 9) An association, the Metadata Working Group, was created by international digital hardware and software manufacturers in 2007 to keep metadata standards up to date.


A widely used standard for image files is the Exchangeable Image File Format (Exif), designed in 1995 by the Japan Electronic Industries Development Association (JEIDA) for digital still cameras; all major providers have programmed their cameras’ processors to “write extensive Exif metadata to each picture” (IPTC 2007, p. 17) and combine them with JPEG, TIFF or RAW files from the camera or with “PSD”, Adobe’s own post production file format. Exif also works with Adobe’s DNG. Some people have issued warnings against Exif because cameras connected to the Global Positioning System (GPS) automatically record geographic coordinates during a shoot (called “geotagging”): the coordinates are supposed to be a kind of “keyword” but can be a threat to people’s privacy or even security. Another metadata model, the Information Interchange Model (IIM, also called IPTC-IIM), was introduced by the International Press Telecommunications Council in 1991, originally for the exchange of texts, graphics and photographs in the newspaper world, before it was adopted in 1995 by Adobe for its post production software. IIM can be embedded into some file formats such as JPEG, TIFF and PNG (but not in GIF). The Extensible Metadata Platform (XMP) is a multimedia standard conceived by Adobe Systems Inc. and released in 2001 for the exchange of data sets among its post production tools. It is compatible with the major file formats. All three are supported by major camera and scanner manufacturers, and also by several software developers whose programs add information on the manipulations carried out on the image in post production. (For more information on metadata see Photo Metadata White Paper 2007.)
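That metadata sit at the beginning of a file, before the image content, can be shown with a JPEG file. A JPEG starts with a start-of-image marker followed by segments, each consisting of a marker, a two-byte length and a payload; Exif data are carried in a segment of type APP1 whose payload begins with the identifier “Exif”. The following sketch only locates this segment in a byte string and does not interpret the individual Exif tags; the sample data are synthetic, the function name is hypothetical, and the parser is simplified (it does not handle padding bytes, for instance).

```python
import struct
from typing import Optional, Tuple

SOI = b"\xff\xd8"            # start-of-image marker that opens every JPEG file
APP1 = 0xE1                  # marker code of the segment that carries Exif data
SOS = 0xDA                   # start of scan: the compressed image data follow
EXIF_ID = b"Exif\x00\x00"    # identifier at the start of an Exif payload


def find_exif_segment(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, length) of the Exif payload in a JPEG byte string, or None."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG file: start-of-image marker missing")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None                      # unexpected byte: stop this simple parser
        marker = data[pos + 1]
        if marker == SOS:
            return None                      # image data reached, no Exif header found
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload_start = pos + 4
        if marker == APP1 and data[payload_start:payload_start + 6] == EXIF_ID:
            return payload_start, length - 2  # the length field counts its own 2 bytes
        pos += 2 + length
    return None


# Synthetic sample: SOI, a JFIF segment (APP0), an Exif segment (APP1), start of scan.
app0_payload = b"JFIF\x00" + bytes(9)
exif_payload = EXIF_ID + b"placeholder for the Exif tags"
sample = (SOI
          + b"\xff\xe0" + struct.pack(">H", len(app0_payload) + 2) + app0_payload
          + b"\xff\xe1" + struct.pack(">H", len(exif_payload) + 2) + exif_payload
          + b"\xff\xda")

location = find_exif_segment(sample)
print("Exif payload found at (offset, length):", location)
```

For real files the bytes would be read from disk; interpreting the tags themselves (date, camera model, GPS coordinates) requires a full Exif reader.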


4.5 Filming movable slides and optical illusions with a (still-)camera
Only the material aspects of a slide can be reproduced (to a certain extent) by photography. If the intention is to reproduce the visual impression of a slide when projected on the screen, a single still image is not sufficient. A film camera with settings for trick photography should be used to demonstrate e.g. the illusory effect produced by a panorama slide, the simulation of changes in space and time with the help of a biunial lantern, or the functioning of a movable slide.


Many slides with lever, rack or other technical options have been created since the magic lantern became a major show attraction. While slipping slides presenting two or three different (fixed) states of a situation can be photographed adequately, rotating ones like the colourful chromatropes, effect slides which let snowflakes fall or make the wings of a windmill rotate, multiple rackwork mechanisms for astronomical demonstrations, motion sequences by the wheel of life, or even experiments with chemical tanks etc. have to be filmed to prove their effectiveness. A panorama slide that is pushed through the beam of the light would also need moving images to demonstrate its functioning. Of course, the windmill or the snowflakes could be digitised in one “frozen” position, and the panorama slide depicted in its entirety or in several shots to break down the action into sequences, but the effect so much appreciated by the audience would be lost.


Today many digital compact cameras are equipped with a video function. This additional tool presents the same problems as normal video cameras, and these sophisticated still-cameras may have even more, as they generate both still and moving images. In the following, some technical aspects promoted by manufacturers in leaflets and brochures will be analysed, as it is essential to understand how far the proposed options have consequences for the reproduction of slides.


4.5.1 Some general information


When slides are filmed, visual information is saved in files. For the transportation of the data, these files are put into a receptacle called a (multimedia or data) container, or simply a wrapper. A wrapper (e.g. Material Exchange Format (MXF) or Matroska) keeps different kinds of information – visual, audio, graphic, textual – together to allow multimedia presentations. The visual data are saved in video compression formats such as MPEG-1, MPEG-2 and MPEG-4 conceived by the Moving Picture Experts Group, H.264 by the Joint Video Team (JVT) or JPEG by the Joint Photographic Experts Group. (This group created the JPEG codec, which gave it its name, and the file format which is also called JPEG.) A video file has many single frames which can have the “suffix” .cineon, introduced by the Kodak company, or .dpx (the file extension stands for digital picture exchange format) created by the Society of Motion Picture and Television Engineers (SMPTE).


Huge amounts of pixels have to be reduced in size to keep the needed storage space limited, create working files for post production or accelerate the transfer speed of data between two devices of the same workflow. This is done by a mathematical scheme (algorithm) which is used by computer programs and is called a codec (stands for encoder / decoder). A codec encodes and decodes a signal or a media stream. As a codec can also compress data, the artificial word is often understood as a compression – decompression tool. Video codecs (e.g. FFV1, OpenH264) compress raw digital visual data to make them usable for viewing on the internet (e.g. streaming), on a physical carrier (e.g. DVD, BluRay), for storage or transmission. Some compression methods are lossy (those applied in the formats MPEG-1, MPEG-2 and MPEG-4), others compress lossless, e.g. the algorithm LZW, named after its authors Abraham Lempel, Jacob Ziv and Terry Welch. (For more on compression see section “Technical components of digitising an artefact. 5. The compression” in the technical section.)
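To make the notion of lossless compression more tangible, the following minimal sketch implements a textbook form of LZW in Python. It is an illustration under simplifying assumptions, not the variant used in any particular file format: codes are kept as a plain list of integers instead of being packed into bits, the dictionary is never reset, and the function names and the sample pixel row are hypothetical.

```python
def lzw_compress(data: bytes) -> list:
    """Encode a byte string as a list of LZW dictionary codes."""
    table = {bytes([i]): i for i in range(256)}   # start with all single bytes
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate                   # keep extending a known sequence
        else:
            codes.append(table[current])          # emit the longest known sequence
            table[candidate] = len(table)         # learn the new sequence
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the original byte string from LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]       # code refers to the entry being built
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        table[len(table)] = previous + entry[:1]  # mirror the encoder's dictionary
        previous = entry
    return b"".join(output)


# Hypothetical pixel row: 40 bright values followed by 40 dark values.
row = bytes([200] * 40 + [30] * 40)
codes = lzw_compress(row)
restored = lzw_decompress(codes)
print(len(row), "bytes ->", len(codes), "codes; identical after decoding:", restored == row)
```

The decoded row is bit-for-bit identical to the input, which is what distinguishes lossless from lossy methods; the gain depends entirely on how repetitive the data are.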


4.5.2 Colour space and registered light values

Video technique was mainly developed for television, therefore many parameters still refer to its standards. While a digital still-camera allows uncompressed raw data to be stored, a video camera generally delivers lossy compressed files due to the huge amount of pixels to be stored: e.g. one coloured video frame in the TV standard PAL in Standard Definition (SD) quality amounts to c. 1,3 MB of pixel data, one second (25 images) to about 33 MB. Television standards were agreed on for black and white broadcasting; when colour TV was introduced the broadcasters needed a technique to satisfy all clients, whether they still had a black and white or already a colour screen. While photographic and cinema cameras worked with the primary colour system red, green and blue, TV technicians split colour and light. The first step: a TV camera has three channels, each with a different filter, to create three different black and white images (“colour separation”). Technicians created one channel containing light values, the luminance channel “Y”, which mixed light information taken from the green (c. 58%), the red (c. 31%) and the blue channel (c. 11%). On the old TV sets, which had no way to reproduce colour, the broadcast image appeared monochrome. For colour TV the technicians added two colour channels, the chrominance channels called “Cr” and “Cb”, supplying information on saturation and hue, and adding values for red (Cr) and blue (Cb) as they were heavily under-represented in the luminance channel. The colour TV sets at home combined luminance and chrominance signals and composed the colour image, but in a “reduced version”. The green parts of the image were almost correctly reconstructed, but red and blue areas only by half. The spectators would not notice; for them the image seemed normal, as humans are highly sensitive to green and much less to red and blue.
The broadcast image was simply adapted to the eye’s sensibility which takes green for brighter: “[…] the green fraction therefore has the greatest weight by far, whereas red and blue light are taken into account to a much lesser degree. This is the case since our eyes are much less sensitive to blue light; fewer than 10% of our color-sensitive retinal photoreceptor cells detect blue light.” (Nasse 2009, p. 21; for more information see Jarczyk 2017, p. 7-8)
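The split into one light channel and two colour channels, and the data volume quoted for PAL, can be sketched in a few lines of Python. Two assumptions are made here that are not stated in the text: the luma weights are those of Rec. 601 (0.299 for red, 0.587 for green, 0.114 for blue, close to the rounded percentages given above), and the PAL frame is taken as 768 x 576 square pixels with 8 bits per colour channel. The function names are hypothetical.

```python
# Assumption: Rec. 601 luma weights, the usual standard-definition coefficients.
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit full-range R'G'B' values to Y'CbCr (full range, chroma offset 128)."""
    y = KR * r + KG * g + KB * b             # luma: weighted sum of the three channels
    cb = 128 + (b - y) / (2 * (1 - KB))      # blue-difference chroma
    cr = 128 + (r - y) / (2 * (1 - KR))      # red-difference chroma
    return y, cb, cr


def frame_bytes(width, height, channels=3, bytes_per_sample=1):
    """Size of one uncompressed frame in bytes."""
    return width * height * channels * bytes_per_sample


# Assumption: a square-pixel PAL frame of 768 x 576 pixels at 8 bits per channel.
pal_frame = frame_bytes(768, 576)
print(f"one frame: {pal_frame / 1e6:.2f} MB, one second: {25 * pal_frame / 1e6:.1f} MB")

for name, rgb in {"white": (255, 255, 255), "green": (0, 255, 0),
                  "red": (255, 0, 0), "blue": (0, 0, 255)}.items():
    y, cb, cr = rgb_to_ycbcr(*rgb)
    print(f"{name:5s} Y'={y:6.1f}  Cb={cb:6.1f}  Cr={cr:6.1f}")
```

Pure green yields by far the highest luma value of the three primaries and pure blue the lowest, which is the weighting described in the quotation from Nasse.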


This system is still used by modern digital still-cameras capable of filming videos, but now with digital information instead of the old terrestrial waves sent from the broadcaster headquarters to the user’s antenna. Still-cameras normally use the additive primary colour system RGB (King 2017, p. 185), their sensors capture red, green and blue light values. When taking a video the camera has to work with other parameters which means the RGB output of the sensor has to be transformed by the processor. As during the old television days, today’s video image is composed of luminance and chrominance, expressed as “Y’ Cb Cr”. Digital expert Charles Poynton (2008, p. 1) shows that (almost) nothing has changed: “Video systems convey image data in the form of one component that represents lightness (luma), and two components that represent color, disregarding lightness (chroma). This scheme exploits the poor color acuity of vision: As long as luma is conveyed with full detail, detail in the chroma components can be reduced by subsampling (filtering, or averaging).” “Sub-sampling” means that something is sampled in reduced quantity, here it’s chroma which is “underfed” in samples.


When the ratio is 4:4:4 the sampling of light and colour is equal. Digitising in 4:4:4 means that all the information on brightness and colour of an object has been captured and stored, as when the still-camera is working in the RGB colour space. In delivery, every pixel still has all the original information on light and colour as no compression has taken place. But when 4:2:2 or even 4:2:0 is noted in the manufacturer’s leaflet (which is mostly the case), a part of the information is “destroyed”. The camera sensor does receive all the information (a lens may lose a little light owing to the construction of its lens system, but it cannot compress); however, the processor applies compression to make the files “lighter”: in the scheme 4:2:2 each pixel of a group of four still has its total luminance, two have also kept their colour values, and the two others have to “borrow” this information from neighbouring pixels. In the formula 4:2:0, only one pixel has kept everything and serves as reference point when the other three need chrominance information (see “Digital Color Coding”).


As already mentioned, the human eye is more sensitive to variations in brightness than in colour and reacts more strongly to the green light band in the visible spectrum than to the other ones. Cameras exploit this characteristic to reduce the amount of information and thereby limit the needed storage space. The “sub-sampling” formula 4:2:2 signals that 50% of the original colour values are irretrievably lost; in the ratio 4:2:0 only a fourth of the originally received colour information is left, 75% is lost. Compression is guided by the idea that neighbouring pixels carry identical information and that “borrowing” therefore does not falsify the result. The question, however, is up to which “degree of neighbouring” this is really the case (see also “4.5.4 High compression technique” below). Under these conditions the claim of the Dutch photographic project “Metamorfoze” (Van Dormolen 2012, p. 4) would be impossible to achieve: “The preservation masters […] must be of such a quality and measurable relationship to the original, that they can in fact replace it. This means that all the information visible in the original must also be visible in the preservation master; the information transfer must be complete since the original is threatened by autonomous decay and will no longer be used once it has been digitized.” The reduction in colour value may not be visible when the video of the moving slides is watched, but it will have consequences in the future, when migration is needed to keep the data “alive”.
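The sub-sampling ratios discussed above can be tried out on a small example. The Python sketch below is a deliberate simplification: it keeps the top-left chroma sample of each block and lets the other pixels “borrow” it, whereas real encoders usually filter or average; the 4 x 4 chroma plane and all function names are hypothetical.

```python
SCHEMES = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def subsample(plane, step_x, step_y):
    """Keep one chroma sample per block of step_x x step_y pixels.

    Simplification: the top-left sample is kept; real encoders usually filter or average.
    """
    return [row[::step_x] for row in plane[::step_y]]


def upsample(small, step_x, step_y, width, height):
    """Rebuild a full-size plane: every pixel borrows the sample stored for its block."""
    return [[small[y // step_y][x // step_x] for x in range(width)]
            for y in range(height)]


def kept_fraction(scheme, width=4, height=4):
    """Share of the original chroma samples that survives sub-sampling."""
    step_x, step_y = SCHEMES[scheme]
    plane = [[0] * width for _ in range(height)]
    small = subsample(plane, step_x, step_y)
    return sum(len(row) for row in small) / (width * height)


# Hypothetical 4 x 4 chroma plane (e.g. Cb values) containing a diagonal colour edge.
cb = [[100, 100, 100, 200],
      [100, 100, 100, 200],
      [100, 100, 200, 200],
      [100, 200, 200, 200]]

for scheme, (sx, sy) in SCHEMES.items():
    rebuilt = upsample(subsample(cb, sx, sy), sx, sy, 4, 4)
    changed = sum(a != b for row_a, row_b in zip(cb, rebuilt)
                  for a, b in zip(row_a, row_b))
    print(f"{scheme}: {kept_fraction(scheme):.0%} of chroma samples kept, "
          f"{changed} of 16 pixels changed after reconstruction")
```

In uniformly coloured areas the borrowed values equal the discarded ones; the pixels that change lie along the colour edge, which is where the assumption of identical neighbours no longer holds.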


The photographers’ guidelines UPDIG (p. 14) state: “Converting to a different color space […] changes the pixel values while attempting to retain appearance.” The transfer from RGB to Y’ Cb Cr may not be noticeable to the beholder, but with every manipulation by an algorithm the risk of artefacts increases, which may ruin the file for long-term archiving. Sub-sampling is not considered “future proofed”, as 50% or more of the originally captured colour information is gone. Yet being “future proofed” is one of the recommendations for video signals made by the Technical Commission of the European Broadcasting Union (EBU). Besides, such a transfer is often a “black box” operation, as the manufacturers do not reveal how the pre-programmed processor has “transcoded” the values (the algorithms are their business secret). This increases the photographer’s risk of losing the collection one day: without the “code” an incorrect reading cannot be counterbalanced algorithmically, which may make the archived data inaccessible forever.


4.5.3 Sensor size and aspect ratio

Advertisements for photographic articles are sometimes incomprehensible for amateurs. This can lead to disappointment when the equipment is delivered. When a booklet or flyer praises the advantages of the product it uses technical keywords that evoke associations which may not correspond to what the buyer gets. Some of these incongruities are presented in the following section.


A manufacturer states in its catalogue that its camera with a full-frame CMOS sensor delivers “uncompressed, YCbCr 4:2:2” files (Sony 2016, p. 23). However, a 4:2:2 signal is per se compressed, as the original data are reduced. The same publication also indicates that the camera shoots in “Full HD 1920 x 1080” and “4K 3840 x 2160”. The latter is commonly known as “Ultra High Definition 4K” as it is the resolution all producers of High Definition (HD) television screens have agreed to.


An image in 4K on a cinema camera sensor has 4096 pixels horizontally on 3112 vertical rows and – in “cinema talk” – an aspect ratio of 1.32:1 (almost the silent cinema full-image standard 1.33:1). For projection with a digital beamer in movie theatres, the image frame is cropped to fit the standards of the cinema industry: the formats are called “scope”, for which the full width is kept and the height is “shortened” (4096 x 1744, aspect ratio 2.35:1, only 56% of the horizontal lines are active), and “flat”, in which both are reduced (3840 x 2160, aspect ratio 1.78:1, 70% active lines). In the above-mentioned case, the promoted still-camera with its 4K video function captures images which are no better than the small “flat” 1.78:1, to fit the HD aspect ratio of 16:9 (in “video talk”). A technically inexperienced buyer who has followed the “digital turn” in cinema will read the manufacturer’s leaflet as cinematographic “4K” while s/he gets a television image.
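The aspect ratios and the share of active lines quoted above follow from simple divisions, which the short Python sketch below reproduces using only the pixel dimensions given in the text. The function and variable names are illustrative.

```python
FULL_APERTURE = (4096, 3112)                      # 4K cinema camera sensor image
CROPS = {"scope": (4096, 1744), "flat": (3840, 2160)}


def aspect_ratio(width, height):
    """Width divided by height, i.e. the x in an x:1 aspect ratio."""
    return width / height


def active_line_share(height, full_height=FULL_APERTURE[1]):
    """Share of the sensor's horizontal lines that remain after cropping."""
    return height / full_height


print(f"full aperture: {aspect_ratio(*FULL_APERTURE):.2f}:1")
for name, (width, height) in CROPS.items():
    print(f"{name}: {aspect_ratio(width, height):.2f}:1, "
          f"{active_line_share(height):.0%} of the lines active")
```

The same two functions can be applied to any other frame size mentioned in a manufacturer’s leaflet to see how much of a sensor image is actually used.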


It is not clear from the promotion booklet whether the “4K” image is taken by the total surface of the full-frame sensor and then cropped in width and height, or whether, right from the start, the sensor’s surface is only partly used. Whatever the case: instead of using the entire surface of the camera’s sensor (36 x 24mm, ratio 3:2), it occupies what the camera manufacturer calls “Super 35mm”, a surface which is even smaller than the APS-C/DX standard of a still-camera (23,6 x 15,7mm, ratio 3:2). Photographer Julia Adair King (2017, p. 37) therefore calls it a “cropped sensor”. “Super 35mm”, which was once an image format (24,89 x 18,66mm, ratio 4:3) for movies, has thus turned into a name for a sensor format (e.g. Canon: 26,2 x 13,8mm, ratio 1.89:1; Panasonic: 26,69 x 14,18mm, ratio 1.88:1).
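How much of a full-frame sensor the various formats occupy can be estimated from the dimensions quoted above. The following Python sketch computes surface area and its share of the 36 x 24mm full frame; the dictionary keys are merely labels chosen for this example, and the result says nothing about how a given camera actually reads out its sensor.

```python
# Sensor and image formats in millimetres, as quoted in the text.
FORMATS_MM = {
    "full frame": (36.0, 24.0),
    "APS-C/DX": (23.6, 15.7),
    "Super 35mm film frame": (24.89, 18.66),
    "Super 35mm (Canon)": (26.2, 13.8),
    "Super 35mm (Panasonic)": (26.69, 14.18),
}


def area_mm2(width, height):
    """Surface of a rectangular sensor or image format in square millimetres."""
    return width * height


def share_of_full_frame(width, height):
    """Surface relative to the 36 x 24mm full-frame sensor."""
    return area_mm2(width, height) / area_mm2(*FORMATS_MM["full frame"])


for name, (width, height) in FORMATS_MM.items():
    print(f"{name:24s} {area_mm2(width, height):7.1f} mm2  "
          f"{share_of_full_frame(width, height):5.1%} of full frame  "
          f"ratio {width / height:.2f}:1")
```

With these figures every “Super 35mm” sensor variant covers well under half of the full-frame surface.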


As photographic expert Michael Reichmann (2006) stated: “physically larger sensors will always have an advantage over smaller ones. This means that (other factors aside) image quality from medium format will be higher than full-frame 35mm, which will be better than APS size, which will have an edge over 4/3 [Four-Thirds], etc, etc.” The buyer believes, having acquired a camera with a full-frame sensor, that s/he is profiting from its high resolution, while this is not the case.


4.5.4 High compression technique

When it comes to high compression technique, cameras are anything but faithful to the photographed original. They follow the tradition developed by TV stations in the 1950s and 1960s which created standards for black and white and colour television out of technical necessity. In the days of optical fibre cables and high speed internet, the argument for compressing is that it keeps bit-rates low.


The compression scheme 4:2:2 advertised by the manufacturer is only valid when the information is directly recorded on an external hard disk. If a built-in memory card is used, the camera samples at 4:2:0 and uses Long-GOP compression (Sony 2016, p. 22-23). GOP stands for “Group of Pictures”, is part of MPEG-2 (MPEG stands for Moving Picture Experts Group, which developed several video encoding standards) and denotes a series of 6, 15 or even 18 consecutive frames which are encoded together and form a unit. In compression, an algorithm reduces the size of a file by “summarising” elements that are considered redundant in the frame itself (intra-frame) or in groups of images (inter-frame) (see section “Technical components of digitising an artefact. 5. The compression” in the technical section).


In his presentation, Roger Cheng (2007) explains how GOP encoding works: it is based on reference to commonly shared information. Lossy systems such as MPEG place at the beginning of each series one “reference frame” (or “key frame”) which is encoded intra-frame (therefore the key frame is also called “Intra-frame” or I-frame). The I-frame forms a unit in itself as it contains light and colour values which, if repeated, are compressed inside the picture; it will serve as reference. The other images are encoded with inter-frame coding, thus in relation to the I-frame and others around. These “others” in the Long-GOP sequence are called “B-frames” and “P-frames”: a second frame refers to the first (key frame) as parameter, a third to the second and the fourth as orientation points etc. The set ends when the next key frame is reached. The reference frames signal the elements which all frames of a frame group have in common; as a consequence just one frame still contains these shared values (but intra-frame compressed), in the others they are eliminated. In decompression an algorithm re-establishes the arrangement of the original values in the frame set: how much can be retrieved depends on the chosen compression method. The bigger the group, the more units have to borrow references from one (“P-frame” or “predictive frame”) or two (“B-frame” or “bi-directional frame”) neighbouring frames. Those with one connection reach about one third, those with two about one sixth of the original image information. The effect may be almost unnoticeable to the human eye; however, Cheng warns that if “errors” occur they will be repeated in the whole group, and “errors” here means “digital artefacts”.


But there is more at stake: a P-frame stores nothing but its difference to the referential I-frame (thus still relates directly to original information), while a B-frame also refers to the already reduced P-frame (which in size is about one third of the I-frame). In the decoding process the algorithm has to guess what the original form of these files was: it is a prediction rather than based “on facts”. The more a video film is compressed, the fewer I-frames it contains and the more P-frames (with just the differences to the original data) and “unreliable” B-frames (with the difference to the difference). A Long-GOP compressed video is not preferable: it is not only “untrue” to what was really captured by the sensor; the further apart the referential I-frames are from each other, the fewer frames with original reference material are available for decoding. The more original data are missing and must be guessed, the higher the risk of artefacts. Lossy compression formulas in still-photography work with groups of four pixels of which an average value is noted to form a block with identical colour and light values (Jarczyk 2017, p. 13), which also leads to hypothetical pictures that can be far from the original. But in a video film the falsification takes on astonishing proportions. The fact that the values have to be “guessed” also has consequences for the cutting of the film: according to Karl Slavik (2014, p. 51) the montage software decodes the group in which the cut is made; it separates the GOP series, then recalculates each frame to bring back its values, thereby changing them into intra-frames as only this kind of picture can be used in montage. The estimated values are principally based on unreliable neighbouring frames, which makes the frames concerned look different from the rest, and the cut is noticeable.
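The dependence of a whole group on its key frame can be illustrated with a toy model in Python. The sketch is a strong simplification and not MPEG: frames are short lists of numbers, the first frame is stored in full, each later frame is stored only as its difference to the preceding frame, and there is neither motion compensation, bi-directional prediction nor lossy quantisation. All names and values are hypothetical.

```python
def encode_gop(frames):
    """Store the first frame in full (key frame) and each later frame
    only as its differences to the preceding frame."""
    key = list(frames[0])
    deltas = [[cur - prev for cur, prev in zip(frame, previous)]
              for previous, frame in zip(frames, frames[1:])]
    return key, deltas


def decode_gop(key, deltas):
    """Rebuild all frames by adding the stored differences one after another."""
    frames = [list(key)]
    for delta in deltas:
        frames.append([prev + d for prev, d in zip(frames[-1], delta)])
    return frames


# Hypothetical group of four tiny "frames" with slowly changing values.
group = [[10, 10, 50, 50],
         [10, 10, 52, 50],
         [10, 12, 52, 50],
         [10, 12, 54, 50]]

key, deltas = encode_gop(group)
print("intact key frame decodes correctly:", decode_gop(key, deltas) == group)

# Damage a single value in the key frame and decode again.
damaged_key = list(key)
damaged_key[0] = 99
decoded = decode_gop(damaged_key, deltas)
wrong = sum(frame != original for frame, original in zip(decoded, group))
print(f"{wrong} of {len(group)} decoded frames are wrong")
```

Even in this lossless toy version one damaged value in the key frame appears in every frame of the group, which is the propagation of “errors” that Cheng warns about; in a lossy codec the differences themselves are approximations as well.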


And one has to add: if the sampling rate is 4:2:0 (thus only one fourth of the original colour information is kept) and the video is reduced in size with Long-GOP compression, how much of the light and colour values of the slide is carried over? As this method only presents the “shadow of the original image”, lossy or “virtually lossless” compression is unacceptable for conservation masters, as it is not sustainable. If one cannot work with RAW data or (mathematically) lossless compression in the video function, it is important that the camera allows the choice of a compression rate, which should be as low as possible. The German Nestor group, working on long-term conservation (Sauter 2010, p. 53), stresses the correlation: the higher the compression, the lower the image quality.


To summarise this point: camera manufacturers promise a lot in their booklets, but one has to be very careful and well informed to understand what the camera really delivers. When selecting the manufacturer, it is necessary to check with the specialised trade the risk of obsolescence that is related to formats and codecs: in 2014 there were c. 1.200 codecs for video and c. 150 for audio, according to Slavik (2014, p. 50). Numerous questions should be asked, such as: how often does a manufacturer bring out new camera models? Have there been major changes? If yes, at what intervals? And one could reflect on whether it is reasonable to buy a digital still-camera with a video function, as the camera software manipulates the received information so strongly (however, a real video camera does this as well). One should also ask the experts whether an apparatus which mixes fixed and moving images reduces or improves the quality of the result. Can this result in counterproductive interferences?


In any case, whatever the answer to this question is, it is obvious that a film made on movable slides is excellent for giving access, but not suitable for long-term preservation.


Practical hints given by researchers

To suit archiving conditions, the Swiss cultural heritage institution Memoriav recommends “uncompressed TIFF (16 bit linear)” and “DPX (10 bit, 12 bit, 16 bit)” for frames, “10 bit 4:2:2 uncompressed (e.g. v210)” and “10 bit 4:4:4 uncompressed (e.g. v410)” for video codec, Matroska and MXF (Jarczyk 2017, p. 44-46) for the wrapper.


4.6 Which resolution for the digitising of slides?
Before addressing the issue of slides, it is interesting to briefly consider how the question of best resolution was dealt with in neighbouring business sectors.


For the scanning of film material, archives and film stock suppliers determined the relation between the frame size of a specific gauge and the minimum resolution for its reproduction in Standard Definition (SD), High Definition (HD), 2K and 4K. By doing so, they set standards: a 35mm-film frame (24,9 x 18,7mm, aspect ratio 1.33:1) is today scanned at 1920 pixels per horizontal line and 1080 pixels per vertical column for use on HD-TV, and at 4096 x 2160 pixels for movie houses’ 4K as promoted by Digital Cinema Initiatives (DCI), a group of major film studios in Hollywood, which published its specifications for digital cinema in 2005. The group’s economic and political power was so great that today every movie theatre in the world is equipped with devices following the “DCI system requirements”. In “photographic talk” this would mean (rounded): 2,1 MP per frame in the HD-TV format and 8,8 MP per frame in 4K size. This is much less than what a full-frame sensor of a still-camera (36 x 24mm) has to offer: e.g. one producer offers between 12,2 and 42,4 effective megapixels per full-format sensor in its camera series, also rounded (Sony 2016, p. 74).


Photography and later film used the emulsion’s potential to reproduce lines (or line couples) per mm (lp/mm) for the evaluation of its “subtlety”. The reproduction accuracy depended on the emulsion’s sensitivity, based on the amount and the size of the silver grains contained in the gelatine layer covering the film base. This traditional parameter was used to calculate the best resolution for digital film cameras and film scanners: the average size of a silver grain was between 2µm and 16µm (some were even as small as 0,2µm), and the size of a pixel was considered best when between 6µm and 8µm. As to the minimum “sensibility” of a sensor, lens expert Hubert Nasse (2009, p. 6) states: “It takes at least two pixels to display a line pair made up of a bright and a dark line.”


It took the film archives several years of discussion to establish the best scanning resolution, as they generally deal with historic prints: they had to consider how the technique had evolved during the last 120 years of cinematography and answer questions such as: which raw film stock emulsion quality was possible in which decade? Which camera lenses were in use when the films were made? And they handled mostly distribution copies whose quality depended on the number of “generations” (camera negative and generally two intermediates before the print was made). As to raw film stock, the European Broadcasting Union (EBU) in their directive “Tech 3289” from 2001 stated: “Experience […] shows that a pixel pitch of 6µm (about 160 pixel per mm) is considered sufficient to reproduce current film stock. This corresponds to a scan of 4k x 3k (actually 4096 x 3112) over the full aperture on 35mm Film. If it is scanned at lower resolution (corresponding to a larger pixel spacing), less information is captured […].” (EBU 2001, p. 60) In the end, agreeing on standards normally means that a minimum quality is assured, but the photographer can always do more. As the film experts Paul Read and Mark-Paul Meyer write: “The best resolution needed for any specific film type is largely a matter of guesswork and experiment.” (Read/Meyer 2000, p. 221)


As to slides, no standards exist yet. A glass negative (first generation) has all the characteristics of the photo-camera it was taken with and of the sort of emulsion that was chosen (or self-prepared); the image on a positive glass plate (second generation) has lost a bit in sharpness, resolution and contrast, but can still be considered close to the original. As many slides and slide series were produced photographically, the standards for photographic reproduction could be applied to all types: photographic slides generally present more details than painted or printed ones, thus they should be taken as yardstick. Although, to our knowledge, no international standards have been agreed on by the photographic community, several organisations have published what they consider best.


Two examples: in the past, the huge Dutch digitisation and preservation project “Images for the Future – Beelden voor de Toekomst” could constitute a model simply by virtue of the gigantic numbers of photographs to be digitised. In its Concept Richtlijnen digitalisering (2007) the Dutch National Archive suggested the following: for photographs and negatives in sizes from 18 x 24mm up to 4 x 4cm 2.400 ppi (with 94,5 pixels per mm and a resolving power of 47,2 line couples per mm) and for picture sizes between 4,5 x 6cm and 6 x 9cm a resolution of 1.200 ppi (with 47,2 pixels/mm, 23,6 lp/mm). This was its ideal scenario. When finally, in 2009, the Nationaal Archief, Nederlands Instituut voor Beeld en Geluid and Netherlands Filmmuseum jointly published the European tender for the digitisation of their collections, they had agreed on much lower standards: 600 ppi for images between 1,8 x 2,4cm and 10 x 12,5cm (with 23,6 pixels/mm, 11,8 lp/mm) and 450 ppi for pictures between more than 10 x 12,5cm and less than 18 x 24cm (with 17,7 pixels/mm, 8,85 lp/mm). (Tender text in the possession of the author.) Financial, technical and practical aspects may have been the reason for this decision.
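The relation between pixels per inch, pixels per millimetre and line couples per millimetre used in these figures can be reproduced with a short Python sketch. It assumes, following Nasse’s statement quoted earlier, that one line pair needs at least two pixels; the function names are illustrative.

```python
MM_PER_INCH = 25.4


def pixels_per_mm(ppi):
    """Convert a scanning resolution in pixels per inch to pixels per millimetre."""
    return ppi / MM_PER_INCH


def line_pairs_per_mm(ppi):
    """Resolving power, assuming two pixels per line pair (one bright, one dark line)."""
    return pixels_per_mm(ppi) / 2


for ppi in (2400, 1200, 600, 450):
    print(f"{ppi:5d} ppi = {pixels_per_mm(ppi):5.1f} pixels/mm "
          f"= {line_pairs_per_mm(ppi):5.2f} lp/mm")
```

The values are an upper bound set by the sampling alone; the lens, the light source and the original itself may reduce what is actually resolved.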


Today, the Memoriav report on photography from 2017 signals a minimum resolution for diapositives and negatives: “24 x 36mm 4.800 ppi [pixels per inch], 6 x 6cm 2.000 ppi, 6 x 9cm 2.000 ppi […] 10 x 15cm 1.200 ppi, 4 x 5 inchs 1.200 ppi, 13 x 18cm 1.200 ppi” (Jarczyk 2017, p. 37). It is not specified whether paper or glass negatives are meant, but this overview gives an indication on the minimal requirements to be fulfilled by a sensor. It is not specified either why bigger formats should be reproduced with a lower quantity of pixels per inch, but it is reasonable to think that the authors were conscious of the increasing amount of megapixels which, from a given size of the picture onwards, would make the files too “heavy”.
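How “heavy” the resulting files become can be estimated from the original’s size and the scanning resolution. The Python sketch below does this for the formats quoted above; it assumes uncompressed RGB data with three channels at 16 bits each (an assumption made for the example, in line with the 16-bit TIFF recommendation mentioned earlier), and the function names are hypothetical.

```python
MM_PER_INCH = 25.4


def pixel_dimensions(width_mm, height_mm, ppi):
    """Pixel width and height of a scan of the given original at the given resolution."""
    return (round(width_mm / MM_PER_INCH * ppi),
            round(height_mm / MM_PER_INCH * ppi))


def megapixels(width_px, height_px):
    """Pixel count in millions."""
    return width_px * height_px / 1e6


def uncompressed_megabytes(width_px, height_px, channels=3, bits_per_channel=16):
    """Assumed uncompressed size: three colour channels at 16 bits each."""
    return width_px * height_px * channels * bits_per_channel / 8 / 1e6


# Original sizes in millimetres and minimum resolutions as quoted in the text.
FORMATS = {
    "24 x 36mm": (24, 36, 4800),
    "6 x 6cm": (60, 60, 2000),
    "6 x 9cm": (60, 90, 2000),
    "10 x 15cm": (100, 150, 1200),
    "4 x 5 inches": (4 * MM_PER_INCH, 5 * MM_PER_INCH, 1200),
    "13 x 18cm": (130, 180, 1200),
}

for name, (width_mm, height_mm, ppi) in FORMATS.items():
    w, h = pixel_dimensions(width_mm, height_mm, ppi)
    print(f"{name:13s} {ppi:5d} ppi -> {w} x {h} px, {megapixels(w, h):5.1f} MP, "
          f"c. {uncompressed_megabytes(w, h):4.0f} MB uncompressed")
```

Under these assumptions the larger formats still produce at least as many pixels in total as the small ones despite the lower ppi value, which is consistent with the explanation suggested above.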