Technical components of digitising an artefact
If we want to understand the digitisation of a slide with a flatbed scanner or a still-camera, we need to grasp the basic elements of this process: how information on light and colour values such as hue, saturation and brightness, which form the material qualities of the original slide, is “translated” into binary code, i.e. into series composed of only two digits: 0 (zero, the electric signal is absent or low) and 1 (one, the electric signal is present or high, above a designated level of electrical charge). Several steps are necessary, involving sensor, amplifier, converter and processor as part of the equipment of the “working space”, and screens (monitor, display) and beamers (projectors) used in the “output space” (reference terms taken from UPDIG, p. 5).
1. The image sensor
It helps to understand how the material properties of a lantern slide – e.g. transparency / opacity, presence / absence of colour in the surface, texture of the glass carrying the image, contrast / sharpness etc. – are registered with the help of a white light beam and a light-sensitive surface. Light which illuminates the slide and is reflected is captured by the light-sensitive element of the scanner (or the still-camera), the so-called “image sensor”. It has a “photo-active” zone with light-sensitive elements, the so-called photo-diodes (or silicon diode photosensors) or pixels (pixel means “picture element”), which form a grid registering rays of a specific spectrum of the (visible) light. The potential of the sensor is (partly) responsible for the quality of the reproduction.
1.1 The construction of the sensor
Most flatbed scanners and some still-cameras work with a charge-coupled device (CCD). CCD sensors have pixels built in several layers: 1. light-collecting lenses on top; 2. filters in the three primary colours red, green and blue (hereinafter RGB) beneath them; 3. light-sensitive photo-diodes which capture the light. When the scanner is active, the light beam falls on the collecting lenses, which concentrate the light rays onto the red, green or blue filter behind them. The filters separate the light, because a photo-diode made of ordinary, not specially sensitized silicon cannot distinguish between different wavelengths. These filters segment the white beam (consisting of all colours) into three different colour bands. Light has waves of different lengths; the human eye can only see a particular section of the whole range, reaching from dark red on one side of the spectrum to violet on the other. A rainbow shows them as clearly separated bands of red, orange, yellow, green, blue, indigo and violet, but normally colours appear as mixtures of different wavelengths according to the light-rays that are reflected by an object. Colour names are conventionally assigned to wavelength ranges: what we call e.g. green goes from bluish green at 492nm (one nanometre = one millionth of a millimetre) with a frequency of 612 THz (terahertz), via “pure green” at 530nm, to yellowish green at 577nm with 526 THz (http://www.physlink.com/education/askexperts/ae2.cfm). A filter on top of a pixel only lets pass a certain range of light-waves, the “light band” it was produced for (e.g. 510-570nm).
Thus each of the RGB filters lets pass rays corresponding to its own colour and absorbs the others: e.g. long red waves are accepted while shorter green and blue waves are stopped by the red filter. This results in three “colour separations”, three grey reproductions of the scanned slide. Each shows many shades of grey corresponding to various light intensities, as a CCD sensor reacts to differences in brightness. The three pictures of the slide have clear(er) areas where the light was let through and dark(er) parts where it was absorbed; each image looks slightly different according to the proportions of red, green and blue in the areas of the slide from which the light rays were received and stored.
The photo-diode of a CCD sensor is a semi-conductor which transforms optical signals into electric charge. It is built of silicon made light-sensitive, which can capture a much wider range (190-1.110nm) than the whole spectrum of visible light: c. 380-770nm (see Aumont 1994, p. 81), with a normal sensitivity between 400 and 700nm (see Myers 2000, p. 10). Thanks to its covering filter the diode reacts only to the visible segment and excludes infra-red and UV frequencies.
The nature of light is complex. In addition to the length of its waves and their oscillations per time unit (frequency) it has a third quality: energy. Light carries energy in the form of so-called “photons”, discrete “packets” of light energy which, according to Michael Langford (2008, p. 25), can bleach colours (as one can see when leaving a coloured sheet for some time next to the window), and which are responsible for the “electrical reactions” in the sensors. The stronger the light, the more photons are present and the more energy is received. This energy is measured in electron-volts (eV): e.g. green light has an energy which ranges from 2,18 to 2,53 eV (https://rechneronline.de/spectrum/). When high-energy photons hit a photo-diode, it generates an electronic impulse; electrons are absorbed and accumulated inside the pixel in the “charge storage region” for a very brief moment. The created electric charge, which is proportional to the intensity of the received light, is transferred to an amplifier at one end of the sensor.
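To make the relation between wavelength and energy concrete, here is a minimal sketch (not from the source; it uses standard physical constants) that reproduces the quoted 2,18-2,53 eV range for green light:

```python
# A small sketch (not from the source): photon energy E = h*c / wavelength,
# converted into electron-volts. Constants are standard physical values.
PLANCK_H = 6.626e-34      # Planck constant, joule-seconds
LIGHT_C = 2.998e8         # speed of light, metres per second
EV_IN_JOULE = 1.602e-19   # one electron-volt, in joules

def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon of the given wavelength, in eV."""
    return PLANCK_H * LIGHT_C / (wavelength_nm * 1e-9) / EV_IN_JOULE

# The green band discussed above (roughly 490-570nm) gives values close to
# the 2,18-2,53 eV range quoted in the text.
for nm in (490, 530, 570):
    print(nm, "nm ->", round(photon_energy_ev(nm), 2), "eV")
```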
A photo-diode can have a square, rectangular or multiangular front part which can only be seen under a microscope as it is many times smaller than a human hair. As IT-specialist Donald P. D’Amato (2000) remarks, this form does not mean that the reproduction represents “[…] an average value of the original object’s reflected or transmitted light intensity within that rectangle. In actuality, the sensors in most digital image capture devices do not ‘see’ small rectangular regions of an object, but rather convert light from overlapping nonrectangular regions to create an output image.”
The greater the surface of a pixel, the more light can fall on it and reach its interior (thus the more “light-sensitive” the pixel), the greater the amount of created data and the broader the dynamic range. In the “passive-pixel sensor” of a CCD all the electric charges created by the photons are simply “read out”: each photo-diode is prompted to send its charge to the neighbouring one (hence the name “charge-coupled device”) until the charge has reached the end of the row, from where it is sent to the amplifier situated in one corner of the array. The amplifier converts it into voltage. The read-out has to be quick to make room for the next charge created by arriving light. The one-dimensional sensor of a line-scanner has only one row and is quickly discharged; the array of a two-dimensional sensor is read out row by row, and the amplifier produces a whole series of voltages. The voltage is then transferred to a so-called A/D transformer.
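How this “bucket brigade” read-out works can be pictured with a toy model. The following sketch is purely illustrative (the charge values and the gain factor are invented, not taken from the source): charges are shifted out row by row and a single amplifier turns them, one after another, into voltages.

```python
# Toy model of a CCD read-out: each row is shifted into a serial register and
# emptied pixel by pixel through the single amplifier in the corner.
def read_out_ccd(charges, gain=0.05):
    """charges: 2-D list of accumulated charges; returns voltages in read-out order."""
    voltages = []
    for row in charges:                      # the array is read out row by row
        shift_register = list(row)
        while shift_register:
            charge = shift_register.pop(0)   # the charge hops towards the end of the row
            voltages.append(charge * gain)   # one amplifier converts it into a voltage
        # the emptied pixels are now free to collect the charge of the next exposure
    return voltages

sensor = [[10, 220, 40],
          [12, 200, 38]]
print(read_out_ccd(sensor))   # [0.5, 11.0, 2.0, 0.6, 10.0, 1.9]
```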
1.2 The A/D transformer (Analogue-to-Digital converter)
The function of the converter is the transformation of the captured analogue information into groups of zeros and ones to fit the needs of the digital binary system. (This resembles, to a certain extent, what the human eye does when it “converts” optical stimuli into neural impulses to feed nerves and brain.) As the electrical charge transformed into voltage is still an analogue signal, it has to be converted into digital data which other devices can read. All digital machines work with the “binary code”, a special language with a “vocabulary” composed only of zeros and ones. The binary code communicates the absence or presence of energy and whether the received signal is of low or high energy. A low signal will be noted as a value of 0, a high signal will create a value of 1 (https://techterms.com/definition/integratedcircuit). The A/D converter has transistors which capture these signals and react by turning a signal on (one) or off (zero) (https://techterms.com/definition/transistor). As the amplifier rhythmically sends the charges collected by the pixels as voltages, the A/D converter evaluates the received electromotive force (this process is called “quantification”) and assigns to each voltage a binary code proportional to its intensity. The materiality of the slide is now readable by computers.
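The principle of this quantification can be sketched in a few lines of code (an assumed 8-bit converter with a 1-volt full scale, not a description of any concrete chip): the analogue voltage is mapped onto one of 256 binary codes.

```python
# Sketch of an A/D conversion: a voltage between 0 and full_scale is assigned
# one of 2**bits discrete binary codes.
def quantise(voltage: float, full_scale: float = 1.0, bits: int = 8) -> str:
    levels = 2 ** bits                        # 8 bits -> 256 steps
    code = min(int(voltage / full_scale * levels), levels - 1)
    return format(code, f"0{bits}b")

print(quantise(0.02), quantise(0.93))   # '00000101' '11101110' - a weak vs. a strong signal
```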
1.3 Other sensor models
The Complementary Metal Oxide Semiconductor or CMOS sensor, also called “APS” sensor, is an alternative to the CCD sensor. As “passive-pixel sensors” (CCD) were relatively slow in reading out the charges, “active-pixel sensors” (APS) were invented in which each photo-diode has its own in-built amplifier. The pixel thus has an additional fourth layer: an electronic unit which, in a classical CMOS sensor, sits between the colour filter and the silicon layer (Wunderer 2015, p. 15). In a sensor with so-called “backside illumination” (BSI-CMOS) the photo-active silicon zone is placed right behind the filters to enable it to capture more light. In the Complementary Metal Oxide Semiconductor the charges are not coupled; each photo-diode has its own amplifier (the “metal oxide semiconductor”, an expression for “transistor”) and can convert the received electric charge into voltage (Eibelshäuser 2005, p. 226-227). Like pixels, transistors are semi-conductors made of sensitized silicon and small control modules. As each pixel is equipped with a transistor, they together form a so-called “integrated circuit” (IC), a network of units which act jointly as a processor. (A processor, the central processing unit of a computer, forms the head of each digital or electronic device; it performs “operations on (data) according to programmed instructions in order to obtain the required information” (Collins 1988, p. 637).) A pixel on a CMOS sensor thus not only collects charges and transforms them into voltage but can also perform the transformation of analogue data into digital codes.
The Contact Image Sensor or CIS sensor is smaller than a CCD sensor. It is based on the CMOS construction and was integrated in fax machines before it was used in flatbed scanners. It normally has one row of several thousand pixels; above it sits an array with a huge number of lenses, and next to it a third row is equipped with “light-emitting diodes” (LEDs) sending out red, green or blue light. The scanning unit thus combines a light source, a lens and a sensor. It slides across and directly under the cover glass, almost in contact with it, as its name – contact image sensor – indicates. A flatbed scanner with a CIS sensor is inexpensive to build as it does not require the mirror set needed by the CCD scanner. Instead of the single lens the CCD works with, the CIS sensor uses many fibre-optic lenses arranged across the width of the moving rack, which avoids distortions, a risk that concerns the single lens of a still-camera or of a scanner with a classical sensor (CCD, CMOS). However, its depth of field is very poor, its colour space is reduced, and it produces more noise. This sensor type can’t be used in photography. (Some illustrations can be found here: http://www.mitsubishielectric.com/bu/contact_image/general/; https://www.ricoh.com/technology/tech/028_scan.html. “Noise” is a technical term used, for instance, in electronics: in our context it means that unwanted energy interferes with an electrical signal, disturbs it and makes its correct reception impossible. It seems that “noise” is derived from the Latin word “nausea”.)
The Super HAD CCD is simply a CCD sensor on which the pixels are placed closer to each other, which enlarges the effective surface of the sensor if more pixels are placed; the principal idea is to leave less space around the picture elements so that light does not fall between them but onto their lenses (https://www.hkvstar.com/technology-news/comparison-between-ex-view-had-and-super-had-ccd-cameras.html).
Next to CCD and CMOS, which are “single-color sensors”, still-cameras can have another sensor: the X-3 or “Foveon sensor”, named after the company that invented it about 15 years ago. The “Foveon X-3 sensor” imitates the well-known blue, green and red colour layer system of traditional tripack film material. The X-3 combines three photo-diodes made of different semi-conducting material, each sensitive to another segment of visible light. These three are placed one on top of the other. As with the light-sensitive layers of the tripack colour emulsion, white light travels through the layers of pixels. Each detects the intensity of the component it is sensitive to. The light-waves penetrate the sensor construction to different depths and create three colour separations. As the X-3 has no need for a Bayer mask, no interpolation is done. The digital reproduction is based on actually received colour values, not on guessed ones; thus this technology reproduces colour most accurately, with finer details and brighter images, as no light is lost to absorbing filters. On the other hand, the sensor produces more noise at higher ISO numbers. The company which uses the “Foveon” counts each (microscopically) visible pixel on the sensor surface three times, thus the number of pixels that come in contact with light is one third of the indicated number of megapixels. According to the specialised trade press, “Foveon” is less used nowadays. John Hedgecoe (2004, p. 409) indicates that the “X3” works like the CMOS sensor, with transistors for each photo-diode.
It seems that alternatives are currently being studied, such as the so-called “Transverse Field Detector” (https://de.wikipedia.org/wiki/Active_Pixel_Sensor), a sensor using a dichroic prism instead of filters to split colours, and another one with clear instead of green filters to capture more light, which results in brighter images (https://www.thephoblographer.com/2013/07/31/an-introduction-to-and-brief-history-of-digital-imaging-sensor-technologies/). The “Organic CMOS sensor”, following the example of Foveon, was presented at the beginning of 2017, but the manufacturer considered it “too early” to bring it to market (https://www.fujirumors.com/fuji-says-its-too-early-for-the-use-of-organic-sensors-and-they-sold-mroe-x10-and-x100-than-they-expected/). The industry is trying to improve the quality of its sensors; more solutions will therefore come up in the (near) future.
In principle a sensor acts like the human eye: on the back of the eye there are three types of cones, or cone cells (“photo-diode”), each sensitive to a group of similar wavelengths of the visible light spectrum (long, medium-long, short waves) according to the pigments they possess (“filter”). They receive the light and send a “colour separation image” to the cells behind (“amplifier”, “A/D-transformer”) which transfer the information via the nervous system (“wire”) to the brain (“processor”). (For more information on how brain and cones work together to make a human being perceive colour, see Zeng (2011), p. 9-10.)
1.4 Advantages and disadvantages of the two most used sensor models (CCD and CMOS)
According to evaluations found on the internet both models have positive and negative qualities.
– CCD sensors generally capture up to 70% of the incoming light, create high-quality pictures as they generally have more pixels, and produce low noise when working under bad lighting conditions, which makes them ideal for still-photography. Due to their simple structure they are less costly to produce. However, they are real energy guzzlers (consuming up to 100 times more than an equivalent CMOS sensor), are slower in reading out due to the “bottleneck” (many pixels pass through the same amplifier, the only one for the whole captured energy), and have a higher risk of “blooming” (unwanted white lines that can be produced by charges spilling over to neighbouring pixels). Their sensitivity is highest in the green segment of the light. As the pixels have no other task than to receive light, their “output uniformity” is relatively high compared to a CMOS sensor (https://emergentvisiontec.com/blog/what-is-a-cmos-camera/).
– CMOS sensors use little power and produce less heat (making them better suited for cameras), but more noise (therefore they have an in-built noise corrector). In their classical version they capture less light, as photons often hit the electronic unit directly behind the filter instead of the photo-diode. They read out faster, as each pixel has its own amplifier and can receive new charges more quickly, and many have a higher sensitivity to the red sector of the light. The amplifier partly acts as a processor (“high dynamic range CMOS”) and manipulates data, and the pixels can differ in their light sensitivity. As each CMOS pixel not only captures light but also works as an amplifier (electric energy to voltage) and as a converter (analogue to digital), each pixel can differ slightly in power from its neighbour; as a result, the “output uniformity” of a CMOS sensor is relatively low (https://emergentvisiontec.com/blog/what-is-a-cmos-camera/).
2. Colour capturing and Bayer mask
It is obvious that filters play an important part in sensor technology. The human eye is more sensitive to middle-long green light-waves: it notices differences in a green gradient more quickly and recognises green shapes more easily than forms emitting long “red” light rays, and it acts comparatively insensitively to the short waves of blue light. The filters on the sensor are organised according to a pattern. Many CCD and CMOS sensors have what is called a “Bayer filter”, a kind of mosaic covering the millions of pixels. A Bayer filter or “Bayer mask” is an array containing 25% red, 50% green and 25% blue filters, forming a grid of regular rows over the whole photo-active zone. These filters cover pixels that are arranged in square groups of four (2 green, 1 red, 1 blue) and work together (Wunderer 2015, p. 14). Each point on the slide corresponds to a specific group of pixels on the sensor which has captured its light values four times.
Each of the three colour filters absorbs light waves from colours different to its own; thus c. two thirds of the light reflected from the slide are not captured. Each pixel under this filter mosaic can only register information that has passed the optical barrier. The quantity of data is unequal due to the different number of filters per colour: the sensor notices green twice (50%, middle wavelengths) but only once the short (25%, blue) and once the long (25%, red) wavelengths. The missing two colour values of each pixel have to be added artificially by a calculation called “interpolation” (meaning the addition (or subtraction) of pixels), which uses information that the neighbouring sensor units have captured, as sketched below. (For details on processing methods to retrieve information and their consequences see D’Amato 2000, https://www.oclc.org/research/publications/library/visguides/visguide3.html)
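What such an interpolation does can be sketched as follows. This is a deliberately simple bilinear scheme over an assumed RGGB layout, not the algorithm of any particular scanner: every pixel’s two missing colour values are averaged from the known samples in its 3x3 neighbourhood.

```python
import numpy as np

def demosaic_bilinear(raw: np.ndarray) -> np.ndarray:
    """raw: 2-D array of sensor values behind an RGGB Bayer mask."""
    h, w = raw.shape
    y, x = np.mgrid[0:h, 0:w]
    masks = [
        (y % 2 == 0) & (x % 2 == 0),          # positions carrying a red filter
        (y % 2) != (x % 2),                    # positions carrying a green filter
        (y % 2 == 1) & (x % 2 == 1),          # positions carrying a blue filter
    ]
    rgb = np.zeros((h, w, 3))
    for c, mask in enumerate(masks):
        samples = np.pad(np.where(mask, raw, 0.0), 1)     # known values, zero elsewhere
        counts = np.pad(mask.astype(float), 1)
        num = sum(samples[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))
        den = sum(counts[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))
        rgb[..., c] = num / np.maximum(den, 1)            # average of the known neighbours
    return rgb
```

The values filled in this way are exactly what the literature quoted below criticises: plausible averages, not measurements.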
Interpolation can produce artefacts as it composes colour values by calculation rather than from captured information. Also, the information the calculation is based on is insufficient; therefore the result is not only hypothetical – photographer Tom Striewisch (2009, p. 40) sees the word interpolation as a euphemism for “guessing” –, it may also be incorrect and generate shades that didn’t exist on the slide. According to photographer Tom Ang (2006, p. 325) interpolation decreases the image quality as it produces a loss in sharpness (see also Kraus 1989, p. 176). Photographer Eib Eibelshäuser (2005, p. 238) states that interpolation not only increases blurring but also produces chromatic noise. IT-specialist D’Amato (2000) adds that interpolated resolution “[…] should be avoided since it only increases the size and processing times of the image but does not enhance its quality”.
Resolution means “the act of separating something into its constituent parts or elements” (Collins 1988, p. 725). The “resolution” of a scanner or camera sensor (or of a monitor or printer) signals its capacity to reproduce details from an original item distinctly and sharply. It depends on the number of sensitive points (pixels) arranged per inch (1 inch = 2,54 cm) on a grid structure capable of measuring light information (for monitor and printer: of reproducing this information). Depending on its resolution the device’s sensor has a certain number of light-sensitive units, e.g. a scanner for movies has a resolution of “2K” (c. 2.000 sensitive elements on a horizontal row) or “4K” (c. 4.000 per line). A two-dimensional sensor has a grid with X times 2.000 or 4.000 light-sensitive points, according to the number of horizontal rows it has. An example: scanners for 35mm film prints which produce reproductions in 2K deliver images of 2.048 horizontal by 1.556 vertical picture elements, thus 3.186.688 pixels (or c. 3,2 megapixels) per film frame; in 4K it is 12.746.752 pixels per frame.
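The frame counts quoted above can be verified with a line of arithmetic (the 4K dimensions of 4.096 x 3.112 are inferred here by doubling the 2K figures):

```python
for label, (cols, rows) in {"2K": (2048, 1556), "4K": (4096, 3112)}.items():
    pixels = cols * rows
    print(f"{label}: {pixels:,} pixels, i.e. about {pixels / 1_000_000:.1f} megapixels")
# 2K: 3,186,688 pixels (c. 3.2 MP); 4K: 12,746,752 pixels (c. 12.7 MP)
```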
The resolution of a scanner or camera sensor (“scan width”) indicates the number of light-sensitive units it can use to take samples from the artefact; the resolution of a monitor indicates the quantity of light units it has at its disposal to show the reproduction (“image resolution”). In order to make devices comparable, the number of light-sensitive or light-reproducing points per inch is used. Image scanner producers sometimes communicate the resolution in counting units for printing, such as dots per inch (dpi), indicating e.g. the number of jet nozzles of a home or professional printer, and lines per inch (lpi), for example the number of visible lines (or rather pairs of lines formed by one in black and the neighbouring one in white) on a grid. 600 dpi means that (with a good printer) 600 distinct dots can be printed on a line of one inch, or, as one inch equals 2,54 cm, that 236 different visible points are aligned on one cm. Sometimes they choose parameters for visualizing devices, counting in pixels per inch (ppi) for e.g. a computer screen; sometimes they measure in samples per inch (spi) to indicate the registered tonal values of a scanner. The sampling rate gives the “number of elements per distance unit” and “indicates how many picture elements exist both horizontally and vertically in an image” (Van Dormolen 2012, p. 24). D’Amato (2000) and Williams (2000) call it “optical resolution” and “(spatial) sampling frequency”; they describe it, in other words, as the “density of the pixels” on a given “unit length on the document”, and they insist on the periodicity of the sampling.
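The 600 dpi figure converts to centimetres as follows:

```python
dpi = 600
print(round(dpi / 2.54))   # 236 printable dots per centimetre, as stated above
```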
Flatbed image scanners have a resolution per row of 1.600-3.200 spi in the low-end and 4.800-5.400 spi in the high-end version; film scanners take about 2.000 samples per line in 2K and about 4.000 in 4K. As to still-cameras, manufacturers indicate the file’s total amount in megapixels (MP), which makes a comparison difficult. The resolution of the scanner is decisive for what can be done with the scan later: a high resolution allows sharper details to remain visible longer when the image is “zoomed in”, allows the scan to be printed e.g. in poster size without showing the pixel grid and staircasing along straight edges, or allows the image to be projected enlarged without it getting out of focus.
For devices with a sensor (camera, scanner), the manufacturing company generally indicates its size (e.g. 2.048 pixels x 1.556 lines), while for printers it gives its number of jet nozzles in dots. As to the information on scanners, digitising expert Don Williams (2000) utters a warning which may not be outdated: “Occasionally, document scanner resolution is specified differently in the two different directions of the scan; for instance, 600 x 1200 dpi. Although both values are considered optical resolutions, the higher one is usually achieved through a higher sampling rate that outpaces the MTF performance. The lower of these two values, associated with the sensor pixel pitch, is probably a better indicator of true detail capture abilities. Most resolution claims greater than 600 dpi should be viewed with suspicion.”
After digitising, the file contains the “optical resolution”, that is the digital form of all that was captured by the device. When a file is reworked, e.g. by adding pixels to the scan by interpolation (to change e.g. its aspect ratio, size or colour depth), it has a so-called “physical resolution” (Kraus 1989, p. 90). Digitising expert Williams (2000) names it “interpolated resolution”; he sees its “benign” effects (e.g. repairing small flaws) but also its negative consequences: “[…] using it as a ‘pixel-filling’ utility to inflate resolution claims is misleading at best.” He refers here to pixels which are filled in where before there were none.
As seen above, the term “pixel” can refer to different elements: the light-sensitive photo-diode on a sensor in a scanner or still-camera is called a pixel, but the term is also used to describe the resolution of a captured image, and it is also applied to the lighting units on a digital screen. Therefore, the photographer Julia Adair King (2017, p. 115) suggests that a distinction be made between “image pixel” and “screen pixel”.
4. The data (or bit) transfer rate
The pixel data have to be transferred to a computer where the information can be processed and worked with. The bit transfer rate (also called “bit rate”) refers to the quantity of data that can be transferred in a given period of time (normally a second) from one medium to another; it refers to the relation between the amount of information in a continuous stream and the speed with which data travel between sensor and computer and/or are registered on the carrier (hard disk). (In a still-camera this concerns the transfer from the sensor to the memory card, then from this card to the computer.) It is counted in bits per second (bps or bit/s), megabits per second (MBit/s or Mbps) or megabytes per second (MB/s). The more (uncompressed) data, the slower the transfer. The bit rate depends on the “arrangement” of the pixels: compressed / not compressed, heavily or lightly compressed. Files can be really big as they contain much information (brightness, colour values etc.). To give an idea: an image in High Definition (HD) quality as needed for television would be 1920 pixels in width x 1080 pixels in height x 3 bytes for the three colour values per pixel (at 8 bits per value), in total 6.220.800 bytes, which divided by 1.000 is c. 6.220 kilobyte (KB), or divided by 1.000.000 is c. 6,2 megabyte (MB) per file. High-end scanners can digitise an image in up to 100 MB (Scheuer, Kettermann 1999-2000, p. 209). So compression is necessary to make the transfer quicker; otherwise too much time is spent waiting, or the system can’t cope with the size of the data package.
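A short sketch of the arithmetic behind such figures (8 bits per colour value assumed; the 480 Mbit/s used below is the theoretical maximum of a USB 2.0 link, chosen purely as an illustration):

```python
def uncompressed_size_mb(width, height, channels=3, bits_per_channel=8):
    """Uncompressed file size in megabytes (decimal, as in the text)."""
    return width * height * channels * bits_per_channel / 8 / 1_000_000

def transfer_seconds(size_mb, link_mbit_per_s):
    """Time to move a file of size_mb over a link of the given speed."""
    return size_mb * 8 / link_mbit_per_s

print(round(uncompressed_size_mb(1920, 1080), 1), "MB for the HD example")        # 6.2 MB
print(round(transfer_seconds(100, 480), 1), "s for a 100 MB scan at 480 Mbit/s")  # 1.7 s
```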
Compression means the reduction of data with the help of algorithms that reorganise data according to redundancy in light and colour values. It is normally part of encoding, which is “the process of putting a sequence of characters (letters, numbers, punctuation, and certain symbols) into a specialized format for efficient transmission or storage. Decoding is the opposite process – the conversion of an encoded format back into the original sequence of characters.” (http://searchnetworking.techtarget.com/definition/encoding-and-decoding) A widely used example of an encoding is ASCII (American Standard Code for Information Interchange), used by computers for the letters of the alphabet, numbers and symbols.
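A three-line illustration of what such an encoding does (the characters chosen are arbitrary): each character is stored as a number, i.e. as a bit pattern.

```python
for ch in "Go!":
    print(ch, ord(ch), format(ord(ch), "08b"))
# G 71 01000111 / o 111 01101111 / ! 33 00100001
```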
Data are often compressed for an easier exchange of content between a variety of devices. Often files are reduced with the lossy JPEG codec. The JPEG algorithm divides the image into separate blocks of 8 x 8 pixels, then applies a mathematical formula to each block and downsizes its data quantity. When the compression is high, the pixel groups become visible when the reproduction is enlarged on the computer monitor. The resulting compressed information is stored in file formats such as JFIF (JPEG File Interchange Format), JPEG2000 or SPIFF (Still Picture Interchange File Format, developed by the International Telecommunication Union (ITU), the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) and completed in 1996). They are recognizable by their filename extensions (.jpeg, .jpg, .jfif, .JP2, .spf).
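The block principle can be sketched as follows. This is only a rough model of the idea (a discrete cosine transform per 8 x 8 block, with the small coefficients discarded), not the complete JPEG codec, which additionally uses quantisation tables and entropy coding; the test image and the number of kept coefficients are arbitrary.

```python
import numpy as np
from scipy.fftpack import dct

def compress_block(block: np.ndarray, keep: int = 10) -> np.ndarray:
    """2-D DCT of an 8x8 block; only the `keep` largest coefficients survive."""
    coeffs = dct(dct(block.T, norm="ortho").T, norm="ortho")
    threshold = np.sort(np.abs(coeffs).ravel())[-keep]
    coeffs[np.abs(coeffs) < threshold] = 0.0
    return coeffs

image = np.random.rand(64, 64) * 255                      # stand-in for a scanned area
blocks = [image[y:y + 8, x:x + 8] for y in range(0, 64, 8) for x in range(0, 64, 8)]
kept = sum(np.count_nonzero(compress_block(b)) for b in blocks)
print(f"{kept} of {64 * 64} coefficients kept")           # the discarded rest is the "loss"
```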
Alternatives are the losslessly (“information-preserving” or “reversible”, as D’Amato (2000) calls it) compressed TIFF (.tif; Tagged Image File Format, which compresses with LZW), sometimes GIF (.gif; Graphics Interchange Format, designed specially for graphics and losslessly compressed with the LZW algorithm, named after its creators Abraham Lempel, Jacob Ziv and Terry Welch) and PNG (.png; Portable Network Graphics, which uses a different lossless method, DEFLATE). Some are open standards (e.g. PNG), but most are owned by companies or institutions (e.g. TIFF was defined by Microsoft, Hewlett-Packard and Aldus). The Swiss media conservation association Memoriav (Jarczyk et al. 2017, p. 11) recalls that “proprietary codecs and file formats” are made by and in the interest of the industry “as it gives it commercial control and generates dependencies”.
The compression can be lossy, which means that information will be lost forever due to the compression (therefore D’Amato (2000) calls it “non-information-preserving” and “irreversible”). It can also be (mathematically) lossless, thus totally retrievable when the information is decompressed to bring the condensed file back to its original amount of data. “Visually lossless compression” means that the data volume is definitely decreased, but the loss is only recognizable to the trained gaze of the expert. As the human eye is less sensitive to fine variations in colour than to small changes in brightness (discontinuities here are easier to notice), the chromatic data are more strongly reduced than the luminance data.
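That “mathematically lossless” really means bit-identical can be checked in a few lines (zlib, the general-purpose DEFLATE library, stands in here for an image codec; the sample data are invented):

```python
import zlib

data = bytes([200] * 5000 + list(range(256)) * 4)   # a uniform run plus varied values
packed = zlib.compress(data, level=9)
assert zlib.decompress(packed) == data              # every byte comes back unchanged
print(len(data), "bytes ->", len(packed), "bytes")
```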
The compression can be done inside each image file (intra-frame) or – for moving images – across groups of image files (inter-frame) and concerns elements that are considered redundant: by “assembling” recurrent patterns (pixels from neighbouring parts of the slide’s surface containing the same information) the algorithm mathematically compiles repeated information in one unit (group of pixels), as in the simple sketch below. Pixels of a uniform wall or a perfect blue sky are easier to compress than heterogeneous surfaces such as slides, where even black parts can vary considerably due to scratches, partly detached dye or uneven hand-colouring. (Therefore JPEG files from the same sensor can be quite different in size.) This procedure can produce compression artefacts: visible errors when the file is “read”, such as “blocky” pixel patterns in certain areas of the image. The “blockiness” is caused by the fact that each picture is divided into blocks of 8 x 8 pixels. These blocks are submitted to compression, but, according to Eib Eibelshäuser (2005, p. 260), each pixel group is encoded independently from the neighbouring ones. The higher the compression, the more blocks become visible.
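A minimal illustration of exploiting such redundancy is run-length encoding, a far simpler scheme than JPEG but built on the same observation: identical neighbouring values need to be stored only once, together with a count. The sample values are invented.

```python
def run_length_encode(values):
    runs, previous, count = [], None, 0
    for v in values:
        if v == previous:
            count += 1
        else:
            if previous is not None:
                runs.append((previous, count))
            previous, count = v, 1
    if previous is not None:
        runs.append((previous, count))
    return runs

sky = [200] * 12                           # a uniform area compresses very well
scratched = [12, 13, 12, 200, 11, 12, 40]  # a damaged dark area hardly at all
print(run_length_encode(sky))              # [(200, 12)]
print(run_length_encode(scratched))        # mostly runs of length 1
```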
Of all the mentioned file formats, JPEG (sometimes called JPG) can be compressed the most, which has its consequences. Wayne Fulton states: “JPG is mathematically complex and requires considerable CPU (Central Processing Unit) processing power to decompress an image. JPG also allows several parameters, and programs don’t all use the same JPG rules. […] Final image quality can depend on the image details, on the degree of compression, on the method used by the compressing JPG program, and on the method used by the viewing JPG program.” And he adds that with JPEG colour images compress “[…] better than grayscale, so grayscale doesn’t decrease as much”, but that the algorithm tends to produce “some false color or color changes” (http://www.scantips.com/basics9j.html). (For more information on the advantages and disadvantages of file formats such as JPEG, TIFF and also on RAW data see “Compression and file formats” in the photographing section.) Thus, due to the eye’s insensitivity, the loss of information may not be noticeable, and the result may seem “good” at first, but lossy data compression is not sustainable as it is not suited e.g. for data migration. (To avoid losing data, migration is needed when the original data carrier is no longer supported by software and/or hardware. It is recommended to migrate files regularly.)
The selected file format thus determines whether a part of the data on colour and light will be “thrown away” or everything be kept for future activities such as migrations to new carriers. Scanners mostly propose JPEG, TIFF or RAW, some GIF and PNG. Helmut Kraus (1989, p. 60-65) mentions older ones such as Bitmap (.bmp) used for Windows, Desktop Color Separation (.dcs) and Encapsulated PostScript (.eps) for layout and printing, PICTure data format (.pict) for MacOS or Scitex Continuous Tone (Scitex CT, .sct, .ct) by the company Scitex, but they may not be widely shared. The choice is quite limited and has to be made carefully. The selection of a file format is the answer to questions concerning what to do with the reproductions. It is the result of the different needs that the files have to fulfil: long-term storage in an off-line back-up archive (e.g. on Linear Tape-Open (LTO) tapes) and sustainability (both require uncompressed files); post-production and flexible transcoding to other file formats (which work best with uncompressed files or minimal compression) and short-time storage (e.g. digital asset management in the form of a near-line archive using a tape library); publication online (as part of a “digital vault” (database) or on a website) and storage in the cloud (in a compressed version to keep the rent down, stored in an unknown location, with the long-term security depending on the business model of the provider) etc. (For more information on the preservation of data streams see Aeby et al. (2017), p. 43-45.)
Before using compression when setting the scanner, it is useful to think of the consequences for the archive: “The creation of collections of digitized […] archives is emerging as a ubiquitous component of cultural heritage programs of all types and sizes. Digital collections […] might also be considered archival collections in their own right – representing the archival characteristics of original source materials, but also reflecting the decisions that archivists make throughout the digitization process. As archives, collections of digitized content are subject to the same tests of quality, integrity, and value as traditional archives built organically over time by organizations and individuals.” (Conway 2011, p. 112) Would anyone keep books of which every third page has been torn out?
Hints by the Information Technology expert: The quality control of a compressed image can be done by “zooming in” on the picture. Wayne Fulton suggests: “Then examine both large and small file images side by side on the same screen, by zooming in to about 4 times the size (400%, huge) on both. You will have to scroll around on them, but the 400% is to help you learn to recognize the artefacts this first time. The differences you see are the JPG artifacts of compression.” (http://www.scantips.com/basics9j.html) On his website the author shows several examples of JPEG artefacts and explains what to look for.
Most still-cameras, but also scanners (and even some printers), work with the three primary colours red, green and blue, which are combined in the colour space called “RGB”. It is today’s dominant colour space. A “colour space” is a special segment of the visible spectrum of light (not to be confused with colour models – the “additive” colour model RGB used for monitors and photography and the “subtractive” CMYK used for printing – which indicate the method by which colour is composed). Colour spaces subdivide the visible spectrum “into perceptual quantities […] to quantify the human visual system” (Kim et al. s.d. [2009], p. 1). However, such a quantification, e.g. decomposing the whole visible spectrum into 31 equal intervals of 10nm bands (Poynton 1997, p. 3), is just an artificial construct, as in reality “the spectrum is continuous and there are no clear boundaries” (Zeng 2011, p. 5).
Humans differ in their quantities of cones: according to Min H. Kim and his colleagues, the estimated average is that for every 40 cones stimulated by long waves, 20 are sensitive to middle-long waves and one is excited by short light frequencies. Therefore, when the Commission Internationale de l’Éclairage (CIE) developed the first colour space in 1931, it adopted a standard conceived for “a hypothetical Standard Observer” (Poynton 1997, p. 3). A colour space is thus a deliberate limitation of the visible segment of light; it is based on a human-made concept and is normally determined by an association or a company as the colour range their products will work with. Therefore, colour spaces are often proprietary. They have names such as ECI RGB (European Colour Initiative, by German publishing houses for printing), ProPhoto RGB (by Kodak for digital cameras with 16-bit colour depth), SWOP CMYK (Specifications for Web Offset Publications, created in 1974 for printing publications), CIE XYZ and CIE RGB (designed in 1931 by the CIE), DCI-P3 (by the Digital Cinema Initiative for film production), Adobe RGB (1998) (for printing) and Adobe Wide Gamut RGB (its successor) (http://www.brucelindbloom.com/). They vary in the depiction of range, shades (red shades are difficult to reproduce) and saturation of colour.
Sometimes a colour space is also called “gamut space”. A “gamut”, originally meaning “the entire range” (e.g. of musical notes or emotions), defines the scale of colours that a system has at its disposal; compared to its reference point (e.g. the entire visual spectrum) it contains only an “extract” of all possible colours the human eye can see. ECI RGB, ProPhoto RGB and Adobe RGB have a “large-gamut space”, which means that they include a wide range of light bands (UPDIG, p. 6).
The proprietary “Adobe RGB” is considered a “sub-variety” of the RGB working spaces, as is “sRGB”, standing for “standard Red Green Blue” and created by Hewlett-Packard and Microsoft in 1996 for their products. It is used for monitors, printers and the internet. “sRGB” is considered a “narrow-gamut space” (UPDIG, p. 6) and satisfactory for low-end products for home use. The terms “wide” and “narrow” can refer to the “color space chromaticity diagram” which the CIE designed in 1931 (composed of the “CIE 1931 RGB colour space” and the “CIE 1931 XYZ colour space”). As it describes all visible colours, a comparison with the range of colours in the CIE’s diagram shows whether a colour space can be considered broad or narrow. Adobe RGB is a colour space created in 1998 by the software producer of the same name, which introduced it for printing; it is therefore more precisely called “Adobe RGB (1998)”. Photographic expert Julia Adair King (2017, p. 185) recommends not working with it, not only because monitors and displays typically work with sRGB, but also because the major Japanese companies have ignored it for the past twenty years.
The American art museum group, responsible for the report on Benchmarking Art Images Interchange Cycles, came to the conclusion: “In our experimentation, there was no significant difference between AdobeRGB (1998), eciRGB, and ProPhotoRGB color spaces on the relative rankings of the resulting prints. As a result, it is probably best to work in the color space most familiar to your print provider. sRGB may be used for images to be displayed on the web.” In any case: one of the American experiments led to the conclusion that more research is needed concerning questions such as: “What is the disadvantage of using a larger color space than sRGB even if the majority of the content colors fall within sRGB if capture is done at a high number of bits/pixel?” (Frey 2011, p. 111)
The coalition of international associations of photographers that issued the “Universal Photographic Digital Imaging Guidelines (UPDIG) for photographers, designers, printers, and image distributors” suggests a large-gamut space “for high-end printing”, while a narrower-gamut space is fine for reproductions “intended only for consumer-level printing or the web” (UPDIG, p. 13). When taking a picture, the photographer or scanner operator often doesn’t know which colour space the next working environments (e.g. photo labs, printing offices) will use. Therefore it recommends: “[…] you can convert a wide-gamut image to a narrow space such as sRGB, while a narrow-gamut image converted to a wide space will not (re)capture the colors of the wider gamut.” (UPDIG, p. 6) Selecting a space with a wide gamut leaves all directions open; a narrow one may block numerous options. On the other hand, the association warns: “While the larger gamut does imply a wider range of image data preserved “down the line,” it also implies bigger image transformations, possibly with bigger shifts in the color of the image, when it is converted to a narrow-gamut color space such as CMYK.” (UPDIG, p. 16; does this refer to “SWOP CMYK”, as “CMYK” is a colour model?) Therefore, before choosing a colour space it should be clear what will be done with the photographs. When a precise answer is not (yet) possible, it could help to digitise the slides twice, first with an extensive, then with a more limited colour gamut.
When the scanner operator or photographer assigns a colour space during the reproduction process, this information is registered together with the data of the reproduced slide. In UPDIG it is called “tagging” (to mark it with a label that is attached to it). This assures that the file looks the same when leaving the capturing situation and when transferred to the “output space”. During the workflow, the attributed colour space is recognized by each device thanks to an implemented ICC “Profile Connection Space” (see “Colour management with colour profiles” in the technical section) which makes the proper reproduction on viewing apparatus possible.
7. Colour gamut
The human eye can distinguish an incredible number of colour shades: according to photographer John Hedgecoe (2004, p. 403) it sees more than 16 million different nuances. (The colour tutorial Cambridge (2017) indicates that the eye can differentiate between c. 10 million shades; which figure is more exact remains open.) Nevertheless, a colour depth of more than 16 bits per channel would create such a huge data volume that the file would be difficult to handle. The higher the number of levels used to represent colour nuances, the bigger the storage space, or the heavier the compression needed to cope with the amount of data. Every pixel can be seen as a kind of “reservoir” “filled” with bits to store the information in binary numbers: the more bits to represent nuances in hue, saturation and brightness, the “heavier” the pixel.
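The “16 million” figure follows directly from the usual colour depth of 8 bits per channel, as this bit of arithmetic shows:

```python
levels = 2 ** 8                 # 8 bits per channel -> 256 levels each for R, G and B
print(levels ** 3)              # 16 777 216 representable combinations
print(f"{(2 ** 16) ** 3:.1e}")  # 16 bits per channel: ~2.8e14 combinations, hence huge files
```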
Besides, the idea of a restriction came up long before digital binary codes were invented, and since 1931 (when the CIE defined the first colour spaces) a series of standards for colour was agreed on to make sure that interfaces and devices can communicate without error. As already stated, the industry works with several colour spaces which are (often three-dimensional) diagrams. They visualize the spectrum of visible light with all the nuances that are possible within a specific colour model such as the additive RGB for monitors, beamer, still-cameras and scanners, or the subtractive CMYK (cyan, magenta, yellow, black; the K stands for “key” as black is a key colour) for printers.
As the whole visual spectrum of colours is not technically reproducible, devices such as computers, cameras, monitors, scanners etc. use a segment precisely defined by the manufacturer, the so-called “gamut” or “colour gamut”: the gamut comprises the section of the colour space that an apparatus can effectively reproduce. Each device has a specific colour range, its particular gamut, which can cover part or all of the colour space the production company works with. Colours which are not present in the company’s gamut (visualized as a triangle inside the colour space) can’t be reproduced; such a colour is said to be “out of gamut”.
As companies generally have their own special gamut, which differs from those of their competitors not only in the choice of the main light bands but also in brightness and colour saturation, this can create conflicts: e.g. when a web browser is not able to convert the colours of an image put on the internet by an archive correctly into its own gamut (https://photographylife.com/what-is-white-balance-in-photography), the picture’s look is falsified. This is due to the fact that a colour space is a kind of mapping system where each colour shade has an assigned coordinate. The definition given by a photographic expert website makes clear that quantification is the strategy of the system: “The color space is the environment in which the subjective nature of color is quantified, including the hue, saturation and brightness of color.” (https://aphototeacher.com/2010/02/01/)
Each point on a slide corresponds to a specific group of pixels on the sensor (under a Bayer mask: two with a green, one with a red, one with a blue filter) which has captured its light values four times. In the process of quantification each pixel (or pixel group) is attributed one of the 256 grades of the grey scale, which can be 0 (no light registered, all was absorbed), 96 (dark grey), 190 (light grey) or 255 (light of full intensity, total white). If the pixel behind e.g. the blue filter is evaluated and receives the (numerical) value “255” (B=255), while the pixels with the red (R=0) and the green filter (G=0) haven’t received any light, this expresses that the depicted point of the slide has the coordinates “0-0-255” (in R-G-B order), which is pure blue.
In reproduction, colour is expressed in numerical values such as the above-mentioned “0-0-255”. However, this is not as natural as it seems: “Color is caused by the spectral characteristics of reflected or emitted radiance, which makes it seemingly easy to understand as a physical quantity. However, color is really a perceptual quantity that occurs in one’s mind, and not in the world.” (Kim et al. 2009, p. 1; see also Poynton 1997, p. 3) Colour is the result of the collaboration of eye and brain, which makes it a half physical, half psychological phenomenon. Additionally, colour is a human-made concept, a convention agreed on by society: “light situations” are given names such as light-green, orange or purple, although they are just wavelengths and their mixtures. The quantification “0-0-255” is based on the physical constitution of human sight: “The human retina has three types of color photoreceptor cone cells, which respond to incident radiation with somewhat different spectral response curves. […] Because there are exactly three types of color photoreceptor[s], three numerical components are necessary and sufficient to describe a color […].” (Poynton 1997, p. 3) The three types of cones react to each wavelength differently, but always to a certain extent. This is expressed mathematically: e.g. 97-140-61 indicates that the incoming light provokes a relatively moderate stimulus of the red cones, a stronger one of the green cells and a weak one of the blue receptors. The contribution of the IT experts – the colour depth with its multiplication of 2x2x2x2x2x2x2x2 = 256 steps per colour channel, which sets the extreme limits of dark and bright – should also not be forgotten.
The sensor has already allocated each of its millions of pixels a distinct position on the grid, defined by its horizontal and vertical coordinates on the array. Each pixel on the sensor corresponds to one in the future digital image. IT-expert Donald P. D’Amato (2000) describes how sensor and image are “mapped”: “The location of each pixel is described by a rectangular coordinate system in which the origin is normally chosen as the upper left corner of the array and the pixels are numbered left-to-right and top-to-bottom, with the upper left pixel numbered (0,0).”
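One common way to turn such (row, column) coordinates into positions in a data stream is a simple row-major index; the 2K frame size from above serves as an example. This mapping is an illustration, not a claim about any specific scanner’s internal format.

```python
def pixel_index(row: int, col: int, width: int) -> int:
    """Row-major position of the pixel (row, col) in a frame of the given width."""
    return row * width + col

print(pixel_index(0, 0, 2048))        # 0, the upper left pixel
print(pixel_index(1555, 2047, 2048))  # 3_186_687, the last pixel of a 2K frame
```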
Thanks to quantification, each specific picture element is assigned a specific coordinate in the gamut. The array structure and the quantification of the light’s intensity make sure that every picture element will later have a corresponding one on the raster of a computer monitor (or other devices), and that it is simulated in the right colour. It is a fantastic workflow: a small group of picture elements on the scanner captures analogue information from a specific point on the slide, which is translated into a digital code, which is sent to a processor, which regroups it with the information from all the other millions of pixels in a file, from which a monitor retrieves it and recreates it at the corresponding position with the help of a small group of light diodes emitting the equivalent light intensity and the right composition: for 0:0:255 the red and green diodes stay dark, only the blue is illuminated at full power.
Old slides can be difficult to reproduce. Scanners and still-cameras are made for modern colours: their sensor filters are sensitive to the typical wavelengths emitted by modern positives, and the gamut is selected according to the needs of photographic paper made to print digitally born images. They work with a colour selection which is better in certain ranges of wavelengths than in others, so they may leave some out, according to what their programming allows or does not allow. As the research project DIASTOR has shown for moving-picture scanners, colours of old films can fall outside the colour gamut set by the producer of the scanner (Flueckiger et al. 2016, p. 111). If a colour is “out of gamut”, it will be “translated” by recalculation into a nuance that resembles the out-of-gamut shade but is not identical with it. (The same holds for monitors or printers.) Film restorer Paul Read states that “some dyes are outside the range of the recording system”, as every dye has its own characteristic spectrum: its own peak absorbency and transmission characteristics (Read 2009, p. 27; Ishikawa, Weinert 2010). In his article Read speaks about early film material, but the dyes used in the first decade of cinematography were manufactured by companies that also supplied slide colourers. Thus a certain number of devices are not able to capture all the information on a historical slide, or will reproduce colours which do not correspond to the original.
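How a saturated colour “falls out” of a device gamut can be made visible with a small calculation. The sketch below uses approximate CIE 1931 colour-matching values for a pure 520nm green and the standard XYZ-to-linear-sRGB matrix; negative channel values mean the colour cannot be mixed from the three sRGB primaries, i.e. it is out of gamut. This illustrates the principle only, not the behaviour of the specific dyes discussed here.

```python
import numpy as np

# XYZ of a monochromatic 520nm stimulus (approximate CIE 1931 colour-matching values)
xyz_520nm = np.array([0.063, 0.710, 0.078])

# Standard matrix converting CIE XYZ (D65) to linear sRGB
XYZ_TO_SRGB = np.array([
    [ 3.2406, -1.5372, -0.4986],
    [-0.9689,  1.8758,  0.0415],
    [ 0.0557, -0.2040,  1.0570],
])

rgb = XYZ_TO_SRGB @ xyz_520nm
print(np.round(rgb, 3))                       # e.g. [-0.926  1.274 -0.059]
print("out of gamut:", bool((rgb < 0).any() or (rgb > 1).any()))
```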
Glass slides are often hand-coloured, especially those from before the mid-19th century, thus some decades before the introduction of animated photography. As the composition of colours changed over time and synthetic aniline dyes came into favour in the 1850s, the peak absorbency and transmission characteristics of earlier dyes differ from those found in the chromogenic colour processes used to tint photographic slides and film prints. Also, the pigments in dyes are hardly ever pure (as a light band can be) and almost never reflect a monochrome light, but instead mix a preponderant light band with a range of different spectral components (Aumont 1994, p. 80), which makes a 1:1 simulation in a modern colour gamut precarious. Serial tests could show which dyes are easy or difficult for today’s devices to reproduce with fidelity. And it would be interesting to explore how these devices could be optimised to capture these qualities.
A professional colour management system is needed for a satisfactory transfer of colour information. It must ensure that data in the colour gamut of a capturing apparatus (e.g. a scanner or camera) can be converted correctly into the specific colour gamut of a reproduction device (e.g. display, monitor, beamer, printer, each with its specific graphics card) so that both present the nuances in a similar way. So-called “colour profiles” help with the communication between the devices. The heart of the workflow is the so-called “Profile Connection Space” (PCS), created by the International Color Consortium (ICC) in 1993. The ICC was established for the purpose of “creating, promoting and encouraging the standardization and evolution of an open, vendor-neutral, cross-platform color management system architecture and components […]” (D’Amato 2000). The PCS has to be implemented between scanner, workstation, supplementary computer monitors, laptops etc. It is a device-independent standard colour space which can be shared by the whole workflow. As each apparatus of the work chain has its own gamut and its own colour profile, the PCS has to “translate” between them. D’Amato (2000) speaks of an “[…] intermediary space [which] can be thought of as a common language, with interpreters required only to translate the common language to and from the languages of each of the input and output spaces.”
A colour profile contains conversion tables and works with algorithms to convert colour values from a particular colour space into, or out of, the PCS’s neutral colour space. As the PCS is an independent reference frame, colours can be “translated” without much loss. The coalition of international photographer associations which wrote UPDIG (p. 13-14) notes: “Assigning profiles can change image appearance without changing the original image data. […] Assigning a profile changes the appearance of an image but not its pixel values. Converting to a different color space does the opposite: It changes the pixel values while attempting to retain appearance.” The risk in conversion is high, as loss can occur e.g. when the gamut of the sending apparatus is bigger than that of the device receiving the information.
A file contains pixels with samples of red, green and blue light in three separate colour channels, each expressing the specific shade of the received colour light as a number on the 256-step grey scale for each colour: e.g. 97-140-61 (which is, according to Tom Striewisch (2009, p. 375), a “medium-lighted subdued green tone”). The three numbers can be considered the “mathematical coordinates” of a colour; like coordinates on a geographic map they designate a specific place in a colour system. With their help the “medium-lighted subdued green tone” can be coupled to the PCS, which will attribute it a specific place in its colour space and give it corresponding new “coordinates”. When the same image file is opened by another device, this one reads the new “coordinates” attributed by the ICC colour space to “97-140-61”. Thanks to the “conversion tool” in its “personal” colour profile, this device “knows” where to put this green tone in its own colour space, which allows that “the [original] colour impression is kept as much as possible” (Kraus 1998, p. 43-44). Producers referring to the ICC standard indicate in the colour profile of their own hardware how far it differs from the ICC norm. The information on the divergence is taken into account when reproducing the colour values saved in the file.
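What a profile does mathematically can be sketched for the common case of an sRGB input: the 0-255 values are linearised (the sRGB “gamma” curve) and then multiplied by a standard matrix that yields device-independent CIE XYZ coordinates, a space closely related to the PCS. The curve and matrix below are the published sRGB ones; a real ICC profile additionally handles white-point adaptation and rendering intents, which are left out of this sketch.

```python
import numpy as np

def srgb_to_linear(value_0_255: float) -> float:
    """Undo the sRGB transfer curve, returning a linear value between 0 and 1."""
    v = value_0_255 / 255.0
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

# Standard matrix from linear sRGB to CIE XYZ (D65 white point)
SRGB_TO_XYZ = np.array([
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
])

green_tone = [97, 140, 61]                        # the "97-140-61" example above
linear = np.array([srgb_to_linear(v) for v in green_tone])
print(np.round(SRGB_TO_XYZ @ linear, 3))          # the tone's device-independent coordinates
```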
An early study on digital artwork reproduction in the US came to the following conclusion: “Significant differences in color quality were found among the tested institutions. There were two main reasons behind this fact: different color sensitivity (i.e., spectral sensitivity) of the systems used, and different approaches to color management. While nothing could be done about the intrinsic color quality of the cameras used (without significant hardware and software changes […]), improving color management routines could improve color quality to a certain degree. The majority of the survey participants categorized themselves within a range from ‘neutral’ to ‘I do not know enough about color management’.” (Berns, Frey 2005, p. 56) The report showed “that implementing digital workflows including color management was a challenge for most cultural-heritage institutions” and that it was “oftentimes implemented while the pressure was ‘on’ to produce a certain number of images.” Colour management has to be taken seriously when organising the workflow, and it should be done before starting the reproduction activities.
Colour management and calibration should be understood as complementary. By implementing a “Profile Connection Space” the correct communication among the devices in the workflow is guaranteed. Calibration ensures “that a device is performing according to a priori specifications that are usually provided by the manufacturer” (D’Amato 2000); it has to make sure that each monitor of the production chain uses the right brightness and correct contrast so that each display shows the reproduction as the other ones do, and true to the original. The UPDIG guidelines (p. 12) add a second aspect – the future viewing conditions: “The purpose of monitor calibration and profiling is to create a situation where the image on your monitor closely matches the image as it will appear on a print, a proof, a press sheet – or, if your work is destined for the web, as viewed on the average un-calibrated PC or Mac monitor.”
Calibration, viewing and room lighting are interconnected. Two evaluation situations have to be controlled or at least anticipated. The in-house viewing conditions can be controlled by implementing standards for the whole workflow, to avoid that what looks fine in one room is considered unacceptable in another. This is called a “metameric effect” or “(illuminant) metameric failure”. Metamerism means that different combinations of wavelengths stimulate all three cone types of the eye in such a way that the different mixtures are perceived as one and the same colour. A “metameric failure” signals that two colours which match under a certain lighting no longer match when the lighting situation has changed. (This is independent of “observer metamerism”: “Observer metamerism occurs when two spectrally different stimuli viewed under the same light source match to one observer, but do not match to another observer.” (Berns, Frey 2005, p. 42)) As for the external conditions, some may be taken into account by including them in the production chain (e.g. printing), while others (e.g. reading files on home computer monitors) can only be guessed.
Documents accompanying the calibration device and manuals on post production indicate how to install the software, adjust colour and grey balance and choose the white point of the monitor. They generally insist e.g. on constant and controlled light conditions in the working space, a longer warm-up phase of the monitor, a neutral grey screen background (as multicoloured screen savers distract the eye), and no coloured clothing or other brightly coloured surfaces in the field of vision, as these may also mislead the cones. Besides, brightly coloured shirts may reflect room light and become visible on the screen. In their guidelines (UPDIG, p. 12) the photographers’ associations insist on the correlation between calibration and working environment: “The correct luminance also depends on ambient light conditions. High-end color work should take place in ambient light controlled for color temperature, flare and luminance.” It is essential for the success of the project that the working space, where the quality of the reproduction has to be evaluated, is carefully planned and equipped for this purpose; otherwise all the efforts for a good monitor setting can be annulled without the photographer noticing it.
As to concrete parameters, the experience of colleagues can be helpful. The report on Benchmarking Art Images Interchange Cycles, for example, recommends the following parameters for the computer monitor: “Gamma: 2.2; White Point: 5000 – 5800K; Brightness: around 120 cd/m²“ (Frey 2011, p. 103), which differs from the less strict recommendations in UPDIG’s Photographers Guidelines (p. 5). For calibrating a device with an internal light source the “white point”, also called “target white”, is a clearly defined reference element: it is the “whitest”, most radiant point of a monitor when illuminated with 100% light power. The white point (thus a point that has no colour at all) is expressed in Kelvin to make it comparable to daylight. The “native white point” is the original white point of the uncalibrated monitor and is determined by the manufacturer. Setting the monitor’s specific white point means aligning it with what is needed for the work. The UPDIG guidelines (p. 12) indicate that the “desired white point” is at 5000 to 5500K when print production results have to be checked digitally, as traditionally the proof of a printed image is checked under 5000K lighting (D50); the association suggests 6000K to 6500K when the image is reproduced by an ink-jet printer. Choosing a white point of 5000 Kelvin gives red and green more influence; at 6500 Kelvin the monitor emphasises blue and subdues the other primary colours. What to choose depends on the objective. The colour space that was selected may also play a role: Adobe RGB (1998) and sRGB both use D65 as reference white point. (For “Gamma” see “The grey scale” in the scanning section. For brightness in candela per square metre see “Light sources in the working space – some general reflections” in the photographing section.)
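As an aide-mémoire, the figures quoted above can be collected in a small configuration sketch. The dictionary merely restates the Benchmarking and UPDIG values cited in the text; the helper function and its scenario names are hypothetical conveniences, not part of any of the cited guidelines.

```python
# Monitor calibration targets as quoted in the text (Frey 2011, p. 103 and
# UPDIG, p. 12). Purely illustrative; actual targets should follow the
# calibration device's documentation and the project's output conditions.

BENCHMARKING_TARGET = {
    "gamma": 2.2,
    "white_point_kelvin": (5000, 5800),
    "luminance_cd_m2": 120,
}

# Hypothetical helper: choose a target white point range per output path.
def target_white_point(output: str) -> tuple[int, int]:
    targets = {
        "prepress_proof": (5000, 5500),  # checked against D50 proofing light
        "inkjet_print": (6000, 6500),    # UPDIG suggestion for ink-jet output
        "web": (6500, 6500),             # sRGB-oriented compromise setting
    }
    return targets[output]

print(target_white_point("prepress_proof"))   # -> (5000, 5500)
```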
Although two research projects give recommendations, the UPDIG rules admit that these indications cannot be applied as a general rule: “There is no single standard for white point, gamma and luminance, because there is no standard for what you are trying to match.” The Benchmarking working group (Frey 2011, p. 11, 31, and passim) has specifically tested how different viewing situations change the appearance of the reproduction: it is the room’s lighting (colour temperature and brightness of the lamp) that ultimately determines how the spectator’s eye will see the digitally presented or printed image. Franziska Frey (2011, p. 102) and her colleagues summarise this as one result of their experiments: “Lighting conditions may have a strong impact on the appearance of a reproduction. Reproductions made under one lighting condition may be a poor representation of the original under another. Files created for a D50 workflow that are modified as a result of evaluation under gallery lighting may produce disappointing results.”
The international associations of photographers therefore recommend: “Adjust these [monitor] settings based on viewing conditions. […] monitor luminance should match the appearance of the display prints in the viewing condition.” (UPDIG, p. 12) This is possible if the viewing conditions are known, which is the case for standard situations such as professional printing, where the industry has agreed on norms such as “D50”. For its tests on the influence of viewing conditions on the acceptance of digital and printed results, the Benchmarking group combined standard lighting situations and worked with light booths; in this way they made a correct evaluation of the digital results next to the original artefact possible. They used “D50” and “D65” to recreate different light situations (Frey 2011, p. 31). D50 stands for lighting at noon of 5000K (Kelvin), D65 for average daylight at 6500K. The International Commission on Illumination / Commission internationale de l’éclairage (CIE) has declared both to be “standard illuminants” and uses the denomination “D” to signal different daylight situations. D50 light is white, resembles “horizon” daylight at midday “when the sun is higher than 30° above the horizon” (Myers 2000, p. 6) and is used in graphic art printing. D65 has a moderate blue cast, is seen as the average daylight experienced in northern countries and is the industrial norm for colour matching of surfaces in car, furniture or textile production. (A third one, D75, called “North Sky Daylight”, has a moderate to deep blue at 7500 Kelvin.) This helps to avoid “metameric effects”; otherwise there is a risk that a product which, at first glance, was accepted by the client is later rejected.
As to presentation on the internet, where each user has their own viewing circumstances, only a “compromise setting” is applicable. UPDIG (p. 12-13) suggests: “Prepare image files for the web on a monitor calibrated and profiled to the sRGB standard gamma of 2.2 and white point of 6500K. This will be a compromise between the uncalibrated Mac gamma standard of 1.8 and the uncalibrated PC standard of 2.4. It is also a reasonable compromise between a prepress monitor calibrated to 5000K and an uncalibrated PC monitor, which may be in the 7300K-9300K range. Modern LCD monitors usually have a native white point around 6500K.”
It is obvious that calibration has to be done regularly, and it is recommended to have it always done by the same person (Frey 2011, p. 103). Nevertheless, ideas vary about the frequency, the basic settings and the best tools on the market. In mass scanning, an error due to faulty calibration can lead to massive, time-consuming digital corrections later on. It is important not to forget that laptops in particular often run with reduced light power to prolong battery life, which makes colours on the display appear darker than they really are; monitors can also have darker and lighter areas if the photometric brightness is not uniform.
The reproduction of colours happens by charging pixels with information about light intensity. Every device has a specific capacity to reproduce colour (its “gamut”); cameras, scanners and computer monitors use the additive colour model RGB, as all existing colours are a composition of red, green and blue wavelengths in different quantities. When light is composed of the three primary colours – thus three monochrome light bands are added to form a beam – the human eye sees it as “white”. In the subtractive model CMYK, which is used for printing, the white light beam falls through filters which each absorb a light segment; this results in secondary colours, each a mixture of two primary colours: white minus red leaves cyan (a mixture of green and blue), minus blue gives yellow (green and red mixed), minus green produces magenta (red and blue together). Thus if white light falls on an object and a part is absorbed (e.g. red is “subtracted”), the object reflects the rest of the light and appears cyan; if two colours are “eliminated” (e.g. red and blue), the colour of the artefact is green. If all the light is absorbed the object appears black because the eye no longer receives any light: it perceives the absence of light as black. This observation only holds for mixed light; with painters’ pigments other hues are considered primary: yellow, red and blue. Their mixture produces other compositions: violet (red and blue), green (blue and yellow) and orange (red and yellow). Although slides have pigments that absorb parts of the visual spectrum (filter function), the additive system of the pixels receiving the reflected light is decisive.
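The relation between additive mixing and the “subtraction” of primaries can be illustrated with a few lines of Python; the sketch below simply removes one or two primaries from white light, following the examples given above.

```python
# Additive vs. subtractive mixing, following the examples in the text.
# White light is the full presence of the three primaries (255, 255, 255);
# "subtracting" a primary leaves the complementary secondary colour.

WHITE = {"red": 255, "green": 255, "blue": 255}

def subtract(light, *absorbed):
    """Remove the absorbed primaries from the light (filter / pigment effect)."""
    return {c: (0 if c in absorbed else v) for c, v in light.items()}

print(subtract(WHITE, "red"))            # cyan:    green + blue remain
print(subtract(WHITE, "blue"))           # yellow:  red + green remain
print(subtract(WHITE, "green"))          # magenta: red + blue remain
print(subtract(WHITE, "red", "blue"))    # only green is reflected
print(subtract(WHITE, "red", "green", "blue"))   # black: no light left
```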
“Colour depth” (also called “bit depth”) stands for the amount of colour information contained in a digital image. It also signals the number of bits a pixel has at its disposal to save information on light. The more bits, the more colour nuances can be registered and stored. The colour depth of a digitising device can be e.g. 8-bit (2⁸): each pixel has 8 bits to store the information. A bit (abbreviation of “binary digit”) is the smallest information unit in computer technology. It communicates one of two possible conditions, whether it is “on” (an electric charge has been received) or “off” (no electricity present). Dennis P. Curtin (2011, p. 32) compares it to a bulb which also has only two states: it is on (light) or it is off (no light). A group of 8 bits is called an “octet” or a “byte” (bite, nip), recalling the inability of early computers to read more than 8 bits at once (http://www.ulthryvasse.de/). The more bits per pixel, the more information it can store. If it has just one bit (2¹) it can only state whether there is light (white) or not (black). This is enough for black and white “images” such as a written text. A pixel with two bits (2²) can signal black, white, light grey and dark grey. The more bits, the more grey tones can be described, as their number increases exponentially, and the more nuances of the original slide can be registered (http://printwiki.org/Scanning).
Colour depth is based on a scale. This scale goes from no colour present (0 is deep pure black) via 254 intermediate steps to full colour present (255 is intense pure white). An 8-bit scale offers 2x2x2x2x2x2x2x2 = 256 light nuances to register each of the three primary colours red, green and blue. Barbara Flueckiger (2003, p. 41) states that a quantification in 8-bit has “[…] theoretically 256 grades of brightness. For historical reasons, it has in practice just 219 steps”. In general use, however, everybody speaks of 256 shades.
Why 256 possible nuances? As a byte is composed of eight bits, each of which can be either zero or one, it allows 256 different combinations. These are used to express letters and numbers: e.g. the decimal number 0 is written in the digital world as 00000000, one is 00000001, ten is 00001010, and the binary code of the decimal number 255 is 11111111. Grey values, too, are expressed in binary numbers. An 8-bit colour depth can describe a maximum of 256 different light intensities from the lowest level 0 to the highest possible value 255, using combinations of zeros and ones for this purpose. While the digitising device is taking samples of the original, each pixel measures the incoming optical signal and assigns an adequate level on this “ladder”. Depending on the structure, a pixel (or a pixel group) registers the three primary colours (also called “colour channels”). Each of the three colour channels has to have received no light at all (0, 0, 0) to represent pure black, or full light for total white (255, 255, 255). In 8-bit each pixel can combine the 256 possible shades of its three channels (256 x 256 x 256) to describe each corresponding point on the artefact, thus it has 16.777.216 potential colour combinations. This amount of colour nuances is called “True Colour” or “24 bits per pixel” (2²⁴ combinations); the mixture of 65.536 shades is denominated “High Colour” or “16 bits per pixel” (2¹⁶). In practice, however, capturing devices tend to have a 10-bit sensor, which is supposed to differentiate between 1.024 shades, or a 12-bit one with 4.096 shades per colour channel. According to a website on photometrics the colour “components are stored in the indicated order: first Red, then Green, then Blue”. (http://www.awaresystems.be/imaging/tiff/tifftags/photometricinterpretation.html)
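The arithmetic described above is easy to verify; the following snippet only restates it, printing the eight-digit binary notation of a few decimal numbers and the number of values each common bit depth can distinguish.

```python
# Binary notation and the number of values a given bit depth can distinguish.

for decimal in (0, 1, 10, 255):
    print(decimal, "->", format(decimal, "08b"))   # eight binary digits

for bits in (1, 2, 8, 10, 12, 16):
    levels = 2 ** bits
    print(f"{bits:>2}-bit: {levels:>6} levels per channel, "
          f"{levels ** 3:,} RGB combinations")

# 8-bit gives 256 levels per channel and 16,777,216 combinations ("True
# Colour"); 16-bit gives 65,536 levels and roughly 2.8e14 combinations.
```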
CCD and CMOS sensors can “read” the three primary colours, but as their pixels are “colour blind” (Kraus 1998, p. 22) each picture element can only register the intensity of the light that has passed through the filter, expressed in degrees of “greyness” including white and black. The Foveon X3 captures all three in one pixel, but it too records only the light’s energy, expressed in its capacity to produce electric charges. The three colour separations show the same image in different grey nuances, as light of a certain wavelength and energy is either let through or “swallowed” by the filter. The other qualities of colour are reproduced on a visualising device. The hue is recreated by “interlacing” the three different grey images into one (in the analogue film world one would say “superimposition”) so that the red, green and blue LEDs can reproduce it on the monitor. Saturation in painting depends on the amount of grey in a colour (at a constant light level): the more grey is present, the less the colour is saturated, up to total loss of colour when the paint is grey and all vividness has disappeared. Or, as Charles Poynton (1997, p. 4) states: “Saturation runs from neutral grey through pastel to saturated colors.” In lighting, the amount of white light present (i.e. light containing all wavelengths) is responsible for the saturation of a colour. The purer (the more intense, i.e. concentrated on one wavelength) the colour, the more saturated it is. As a colour space contains saturated and unsaturated colours, this information is taken into account when the digitising apparatus assigns a coordinate in its colour gamut that will be transferred into the colour profile of the monitor. Each of the colour shades which the device can represent has its unique coordinate in the three-dimensional gamut. As to lightness: the brighter a spot on the original slide, the more brightly the corresponding monitor pixel has to shine.
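How the three grey separations become one colour image again can be sketched as follows. The three “separations” are tiny made-up 2x2 lists, not real scan data; they only serve to show that the hue is recreated by interlacing the per-channel intensities pixel by pixel.

```python
# Recombining three grey-scale colour separations into one RGB image.
# Each number is the intensity (0-255) that passed the respective filter.

red_sep   = [[200,  30], [ 97, 255]]
green_sep = [[ 50,  30], [140, 255]]
blue_sep  = [[ 40, 200], [ 61, 255]]

height, width = len(red_sep), len(red_sep[0])
rgb_image = [
    [(red_sep[y][x], green_sep[y][x], blue_sep[y][x]) for x in range(width)]
    for y in range(height)
]

print(rgb_image[1][0])   # -> (97, 140, 61), the subdued green tone again
print(rgb_image[1][1])   # -> (255, 255, 255), i.e. pure white
```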
As Helmut Kraus (1998, p. 26-27) states in his book, the human eye cannot see the pixels as separate units; it captures only a general impression. It notices the brightness of certain areas (when the LEDs reproduce the captured light’s high intensity) or their darkness (when the registered light was weak or even missing) and the dominant colour in this section of the image. The screen uses the same technique as the impressionistic painting style called “pointillism”: e.g. a mixture of many illuminated red and blue but only a few green screen pixels is recognised as magenta, a kind of lilac.
It is necessary to distinguish between the light “capturing act” and the light “screening act”. A sensor with a resolution of 8-bit per primary RGB colour can theoretically reproduce 256 x 256 x 256 = about 16,7 million different colour values. This seems a lot, but is poor in comparison to the palette nature has to offer. Therefore, in “capturing”, a colour depth of 10-bit or 12-bit is more common. As for flatbed scanners, the expert Julianna Barrera-Gomez recommends 16-bit (2¹⁶) per colour (a potential of 65.536 different grades per colour and about 281 trillion shades produced by the mixing of light waves), which is excellent for a digital reproduction of hand-painted slides with original dyes that can have a wide range of colours. 16-bit is also recommended in case paper documents related to the slides are reproduced with the same scanner (Barrera-Gomez 2012, p. 3-4). The “screening” device may not be able to reproduce this finesse. Computer monitors normally have a colour depth of 8-bit: 256 nuances are poor compared to 65.536 different shades in 16-bit. And nothing at all compared to the human eye, which, according to photographer John Hedgecoe (2004, p. 403), has a “colour depth of 32-Bit”.
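The gap between the “capturing” and the “screening” side can be illustrated by quantising a 16-bit sample down to the 8 bits a typical monitor can show. The sketch assumes a simple linear mapping without any tone curve, which is enough to show how many captured nuances collapse into one displayable value.

```python
# A 16-bit channel distinguishes 65,536 grades; an 8-bit display only 256.
# Quantising therefore maps blocks of 256 captured grades onto one
# displayable value (simple linear mapping, no tone curve).

def to_8bit(sample_16bit: int) -> int:
    return sample_16bit >> 8        # keep the 8 most significant bits

for sample in (0, 255, 256, 40000, 65535):
    print(f"16-bit {sample:>5} -> 8-bit {to_8bit(sample):>3}")

# 0 and 255 both end up as display value 0: 256 captured nuances share
# each of the 256 values the monitor can show.
```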
Although a monitor cannot reproduce all the registered colour shades, having more bits can be very helpful in post production, for instance when images are too dark. Post production software makes use of an analysing tool: the histogram. A histogram is a diagram that shows statistically (in a needle diagram or a bar chart) how many pixels have the same light intensity. Every needle corresponds to one specific grey value (its specific luminance) on a grey scale of 256 steps, 1.024 nuances or 4.096 shades. Three histograms (one per colour) indicate whether the reproduction has a rather harmonious range of light values (expressed by regular needle sizes from left to right) or a heterogeneous distribution (strong peaks signalling an accumulation of certain light values). If the majority of the needles is closer to the left “zero side” (absence of light), the reproduction could be underexposed (if it is not a night shot); if it is nearer to 255 on the right side and has (almost) no needles near the other end, it is probably overexposed.
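Such a histogram is straightforward to compute. The sketch below counts how many pixels of a small, invented list of grey values fall on each of the 256 steps and derives a crude under-/overexposure hint in the spirit of the description above; the thresholds are arbitrary illustrations, not established rules.

```python
# Counting pixels per grey value: a luminance histogram on a 256-step scale.
# The pixel values are invented to mimic an underexposed image.

pixels = [5, 12, 12, 30, 30, 30, 45, 60, 60, 75, 90, 110]   # mostly dark

histogram = [0] * 256
for value in pixels:
    histogram[value] += 1

# Crude exposure hint: compare how many pixels sit in the darkest and
# brightest quarters of the scale (thresholds chosen for illustration only).
dark = sum(histogram[:64])
bright = sum(histogram[192:])
if dark > len(pixels) * 0.5 and bright == 0:
    print("Histogram leans heavily to the left: possibly underexposed.")
elif bright > len(pixels) * 0.5 and dark == 0:
    print("Histogram leans heavily to the right: possibly overexposed.")
```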
Therefore, it is important to choose a digitising device with a sufficient “colour depth”. With a higher one it is possible in post production to stretch the diagram towards the “poor” side, so that the image can be made brighter or darker without producing annoying gaps (i.e. missing information) between the needles, which would be the case with a low colour depth (signalled by jumps in lightness and possibly a stair-stepping effect in the transition between two light zones).
Tom Striewisch (2009, p. 50-51) makes this plausible: when a histogram is stretched to cover the “poor” side as well, the needles (representing light information) change their position on the scale. When an image is underexposed, the starting point in the dark (left) corner of the diagram is zero and stays number zero on the scale, but the second light value (“no. 1”) has moved and now occupies the position of the third light grade, the original “no. 2” takes the position of “no. 4”, “no. 3” replaces “no. 6”, “no. 4” is now situated where light shade “no. 8” was before, etc. There are gaps where the original values were before they moved to the brighter (right) side: the steps 1, 3, 5, 7 etc. are empty. Where the transition between different zones of light intensity was smooth before, the gaps (missing information) now produce jumps which can be observed as graininess. If these gaps can be filled with information thanks to a higher colour depth, the transition will become smooth again. As higher colour depths (beyond 8-bit) are not fully reproduced in the diagram – only every 4th (10-bit) or 16th (12-bit) light value can be indicated on the 256-step scale – the graphic, when stretched, is filled with some of the “suppressed” light values, so the needles still form an uninterrupted pattern (see Kraus 1998, p. 130-133).
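Striewisch’s doubling of the light values can be reproduced in a few lines. The sketch stretches a set of light values that only reach the dark end of an 8-bit scale and lists the empty steps (gaps) that appear, exactly as described above.

```python
# Stretching an underexposed 8-bit histogram: every light value is doubled
# so that the occupied range spreads towards the bright side. The odd steps
# remain empty, which is the loss of information described by Striewisch.

original_values = [0, 1, 2, 3, 4, 5]           # light values actually present
stretched = [v * 2 for v in original_values]   # 0, 2, 4, 6, 8, 10

print("stretched positions:", stretched)
gaps = [step for step in range(max(stretched)) if step not in stretched]
print("empty steps (gaps): ", gaps)            # 1, 3, 5, 7, 9

# With a 10- or 12-bit source there are intermediate values (4 or 16 per
# 8-bit step) available to fill these gaps after stretching.
```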