XAS Tutorials

XAS Data Pre-Processing

Once you have successfully installed EXAFSPAK (and an X-Windows server), you are ready to start processing raw XAS data files collected at SSRL. Over the years, many students and postdocs in the Scott group have processed data, and we have archived both the raw and the processed data. For their work to be of use to you (and for your work to be of use to future students and postdocs), we have developed a standardized directory hierarchy within which both raw and processed data files reside. This is described in detail in the XAS Data Handling protocol, and you should familiarize yourself with this procedure. In summary, copies of raw data files are brought to your working directory, pre-processing and reduction are carried out there, and copies of the resulting processed files are then stored in a defined location. Details of this processing are then logged in our data index and data log forms.

The input raw data files are obtained from XAS-COLLECT, the current data collection software in use at SSRL. These files are binary; each contains a single scan of XAS data, separated into arrays for monochromator position and individual detector outputs (details below). They have filenames like FX12B_044.001, in which FX12B is a user-assigned filename that collects all the scans of a given run, 044 is the run number that is assigned uniquely by XAS-COLLECT, and 001 is the scan number for this run. In this and subsequent protocols, a generic raw data file will be specified as file_rrr.sss.
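The file_rrr.sss naming convention lends itself to simple scripting when you need to sort or group scans outside EXAFSPAK. The following is a minimal Python sketch, not part of EXAFSPAK; the function name parse_raw_name is hypothetical, and the pattern assumes that run and scan numbers are always exactly three digits, as in the examples above.

```python
import re

# Pattern for the generic raw-file name file_rrr.sss:
# a user-assigned prefix, a 3-digit run number, and a 3-digit scan number.
# (Assumption: run and scan numbers are always exactly three digits.)
RAW_NAME = re.compile(r"^(?P<prefix>.+)_(?P<run>\d{3})\.(?P<scan>\d{3})$")


def parse_raw_name(filename):
    """Split a name like 'FX12B_044.001' into (prefix, run, scan)."""
    match = RAW_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a file_rrr.sss name: {filename!r}")
    return match.group("prefix"), int(match.group("run")), int(match.group("scan"))


print(parse_raw_name("FX12B_044.001"))  # ('FX12B', 44, 1)
```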

Since the raw data files are binary, they cannot be viewed without software designed to read this particular format. EXAFSPAK is a suite of programs designed to read these files and to carry out data reduction and analysis. More information is available online (http://www-ssrl.slac.stanford.edu/~george/exafspak/exafs.htm).

MLIST

C:\dir>mlist F*.*

C:\dir>mlist FX12B_044.001 /array

C:\dir>mlist F*.* /find="FdxB"

/array specifies that output should be both header information and a summary of array definitions

/find="string" searches the files specified for any whose header contains "string" (case-sensitive)

[Throughout this and subsequent tutorials, text in this Sans Serif font (and indented) will be comments, while text in Serif font will be text seen in your Command Window. If the text is black, the computer put it there; if it is red, you put it there. C:\dir> represents a generic prompt. In Command Windows, it is assumed that each command is followed by pressing the Enter/Return key.]

This program provides a way to extract information about a given raw data file or files. Minimally this is the header information that was created at data collection time. Another useful set of information is the structure of the arrays of data in the file. There is also a search function that allows the user to find a data file with a given text string in the header.

Use of mlist file_rrr.sss with no qualifiers provides just the header of that file (or files if wildcards are used). Use mlist * to list the headers of all raw data files in the current directory (this works like a dir command but only looks at XAS raw data files). The following screenshot shows a result with no qualifiers as well as the result of using the /array qualifier.

Command Window, mlist examples

MVIEW

C:\dir>mview /num=10-39 /den=4 /dev=2 FSBRA_006.001

C:\dir>mview /num=10-39 FSBRA_006.001

/den defines which channel to use as denominator (default /den=4)

/num defines which channels to use as numerator. A range of channels will be displayed one by one, as num/den.

/dev defines what output channel is used: 2 for terminal display, 8 for PS file, 10 for X-window (and others). Try not specifying /dev first; the default often works.

This program is used to check visually each detector channel of each scan for glitches, anomalies, noise, etc. A given data file contains all detector channels of a single scan. In the above example, Array positions 4, 5, 6 contain I0, I1, I2, while Array positions 10-39 contain 30 channels of "SCA1" (fluorescence data) and Array positions 40-69 contain 30 channels of "ICR" (incoming count rate) for the same 30 detector elements. (The ICR data may or may not be present in the data file.)

As an example, all fluorescence data channels for a given scan can be viewed on a single plot with the following command: mview /num=10-39 fsbra_006.001. [Successful execution of this command (and many others in EXAFSPAK) will open a new graphics window to display the result. When you have finished examining the result, hit the Enter/Return key to return to the CLI terminal emulator (the Command Window). If this new graphics window does not appear, make sure that EXAFSPAK is installed correctly to use either inherent terminal graphics (e.g., ReGIS on a VAX) or X-windows. For X-windows, also make sure that your X-windows server is running and that the System Variables are set properly, as discussed in the EXAFSPAK Installation protocol.]

X-Window displaying 30 SCA1 channels

It may be hard to compare 30 channels at a time, in which case you can display any range of these channels you like with the /num qualifier. You also can use additional qualifiers /xmin, /xmax, /ymin, /ymax to specify an explicit range of data, rather than the extent of all channels (the default). (It will be useful to leave these off the first time to get a feeling for the appropriate values.)

If there is a question about whether a given channel may contain an anomaly or noise, look at the ICR channel associated with the same physical detector element, which is often more sensitive to electrical noise, crystallite diffraction artifacts, etc. The ICR channels can be examined in the same way, for example with mview /num=40-69 fsbra_006.001:

X-Window displaying 30 ICR channels

With the default assignment (SCA: 10-39; ICR: 40-69), it is useful to remember that for a given physical detector element (el = 1-30), SCA = el + 9 and ICR = el + 39. So the ICR for a given SCA is SCA + 30. Should you decide that a particular SCA channel should be discarded, make a note of it for use later on in the data processing. Be conservative in discarding data; you should be absolutely convinced that a given channel will contribute more noise than signal if included in the weighted average. For example, a channel having an edge jump that appears smaller than the others is not a priori a reason to discard that channel. Also, problems in the ICR trace (e.g., above) should only be used to confirm suspicions about an SCA trace since some of the problems observable in ICR ratio out very well in the SCA/I0 plot.
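The offsets above are easy to mix up when moving between SCA and ICR plots. The short Python sketch below simply encodes the default assignment (SCA: 10-39; ICR: 40-69); the function names are hypothetical, and the offsets must be changed if MLIST shows a different array layout for your files.

```python
# Default array layout described above: SCA in positions 10-39, ICR in 40-69.
SCA_OFFSET = 9    # SCA position = element + 9
ICR_OFFSET = 39   # ICR position = element + 39
N_ELEMENTS = 30


def _check_element(element):
    if not 1 <= element <= N_ELEMENTS:
        raise ValueError(f"detector element must be 1-{N_ELEMENTS}, got {element}")


def sca_position(element):
    """Array position of the SCA channel for a detector element (1-30)."""
    _check_element(element)
    return element + SCA_OFFSET


def icr_position(element):
    """Array position of the ICR channel for a detector element (1-30)."""
    _check_element(element)
    return element + ICR_OFFSET


def element_for_position(position):
    """Detector element for an SCA (10-39) or ICR (40-69) array position."""
    if 10 <= position <= 39:
        return position - SCA_OFFSET
    if 40 <= position <= 69:
        return position - ICR_OFFSET
    raise ValueError(f"position {position} is neither SCA nor ICR")


# A noisy ICR trace at array position 62 belongs to detector element 23,
# whose SCA channel sits at array position 32.
element = element_for_position(62)
print(element, sca_position(element))  # 23 32
```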

Another useful trick is that the /num qualifier can have a complicated set of channels specified. For example, in the above plot, you might want to convince yourself that the noisiest channel is in fact channel 62. You can redo this plot leaving out only that channel by using mview /num=40-61;63-69 fsbra_006.001. If you are using Cygwin/X, this command may not work since some Unix shells (e.g., sh, bash) trap semicolons. In that case, precede each semicolon with a backslash: mview /num=40-61\;63-69 fsbra_006.001
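To double-check which array positions a compound /num specification covers before running MVIEW, you can expand it by hand or with a few lines of Python. This sketch only mimics the range;range notation shown above; it is not EXAFSPAK's own parser, and expand_channels is a hypothetical helper name.

```python
def expand_channels(spec):
    """Expand a spec such as '40-61;63-69' into a sorted list of array positions."""
    channels = set()
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(value) for value in part.split("-", 1))
            if low > high:
                raise ValueError(f"bad range: {part!r}")
            channels.update(range(low, high + 1))
        else:
            channels.add(int(part))
    return sorted(channels)


positions = expand_channels("40-61;63-69")
print(len(positions), 62 in positions)  # 29 False
```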

MCALIB

C:\dir>mcalib /elem=Fe /edge=K /peak=0.01 FSBRA_006.001

/elem is the two-letter element symbol for the edge being calibrated (default /elem=Fe)

/edge is the edge type (default /edge=K)

/peak gives a lower limit; used to ignore a derivative peak lower than a given fraction of the largest peak (useful for noisy data)

This program calculates the first derivative in the edge region of an internal reference spectrum and locates the peaks of that derivative, which correspond to inflection points of the edge. Usually, the first peak in the first derivative of the edge region (the first inflection point) is used to calibrate the energy scale for experimental data. MCALIB writes a file named CALIBRATE.DAT, in which it accumulates a list of data file calibration positions. This file will be read by MAVE, so this is the first point at which you need to consider the set of individual scans you plan to include in your averaged file. You should calibrate all the files you could imagine including in the average in one session (creating one CALIBRATE.DAT file); when MAVE is used, you can change your mind and exclude some. Run MCALIB on each data file in turn until all files for a given sample (average) have been examined.
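To make the idea of "first peak in the first derivative" concrete, here is a minimal Python sketch on synthetic data. It is not MCALIB's algorithm: the central-difference derivative, the simple local-maximum test, and the arctangent test edge are all assumptions chosen for illustration. The peak_fraction argument plays a role loosely analogous to the /peak qualifier.

```python
import math


def first_derivative(energy, signal):
    """Central-difference derivative at interior points: (energies, slopes)."""
    e_mid, slope = [], []
    for i in range(1, len(energy) - 1):
        e_mid.append(energy[i])
        slope.append((signal[i + 1] - signal[i - 1]) / (energy[i + 1] - energy[i - 1]))
    return e_mid, slope


def derivative_peaks(energy, signal, peak_fraction=0.01):
    """Energies of local maxima of the derivative, lowest energy first.

    Peaks smaller than peak_fraction times the largest derivative value
    are ignored, which helps with noisy data.
    """
    e_mid, slope = first_derivative(energy, signal)
    threshold = peak_fraction * max(slope)
    peaks = []
    for i in range(1, len(slope) - 1):
        is_maximum = slope[i] > slope[i - 1] and slope[i] >= slope[i + 1]
        if is_maximum and slope[i] >= threshold:
            peaks.append(e_mid[i])
    return peaks


# Synthetic reference spectrum: an arctangent edge with its inflection
# point placed at 7112.0 eV, sampled every 0.5 eV (illustrative values only).
energy = [7090.0 + 0.5 * i for i in range(101)]
signal = [math.atan((e - 7112.0) / 1.5) for e in energy]

peaks = derivative_peaks(energy, signal)
print(peaks[0])  # 7112.0
```

With real data the derivative is noisy, so some smoothing before differentiation would normally be needed; that is left out here to keep the sketch short.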

For example, mcalib /el=fe fsbra_006.001 generates the following graphical window:

X-Window plotting edge and 1st derivative of calibrant

In this window, if necessary, you can use left or right arrow keys to move the selected peak position (the uniquely colored vertical marker); press the Return/Enter key to confirm your choice (always pick the first inflection point). Then repeat the process for all other data files in your list.

It is worth examining the contents of the resulting CALIBRATE.DAT file. You can display it in the CLI window with the type calibrate.dat command or you can use the NotePad application from a CLI command: notepad calibrate.dat, which generates the following NotePad window:

Notepad Window listing CALIBRATE.DAT file

In the ideal circumstance, all the calibration energy values will be very similar (as in this example). If this is the case, the differences across scans are probably just arithmetic uncertainties in the peak finder, and a single averaged energy should be used to perform the calibration (done in MAVE). If, on the other hand, these numbers are not the same (e.g., a shift in the calibration occurred in the middle of the series), then you may choose to calibrate scans individually (see MAVE), or you may choose to break the series of scans into two groups that are each calibrated with the average energy of that group (essentially two executions of MCALIB and MAVE). A decision can then be made down the line about whether to average these two averaged files together.
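The "are these values similar enough?" judgment can be framed as a quick check of the mean and the spread of the calibration energies. The Python sketch below uses hypothetical energy values typed in by hand (it does not read CALIBRATE.DAT, whose layout is not described here), and the 0.1 eV tolerance is an arbitrary illustrative threshold, not a group standard.

```python
from statistics import mean


def summarize_calibration(energies, tolerance_ev=0.1):
    """Return (average, spread, consistent) for a list of calibration energies.

    'consistent' is True when max - min is within tolerance_ev, suggesting
    that one averaged energy can be used for the whole series.
    """
    average = mean(energies)
    spread = max(energies) - min(energies)
    return average, spread, spread <= tolerance_ev


# Hypothetical first-inflection energies (eV) for two series of scans.
stable_series = [7111.05, 7111.07, 7111.06, 7111.06]
shifted_series = [7111.05, 7111.06, 7111.55, 7111.56]

for series in (stable_series, shifted_series):
    average, spread, consistent = summarize_calibration(series)
    print(f"average {average:.2f} eV, spread {spread:.2f} eV, consistent: {consistent}")
```

For the second series, the jump between the second and third scans is the kind of mid-series shift that would argue for calibrating individually or splitting the scans into two groups.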

MAVE

C:\dir>mave /out=fsbra.ave <input_file_spec>

/out specifies an output filename (default = mave.ave)

<input_file_spec> is a wildcard input file specification to denote a set of input files (if it is not specified, this list will be taken from a CALIBRATE.DAT file; note that this can be a single file)

This program is used to calculate the weighted sum over (30) fluorescence channels (or ln(I0/I1) or any other ratio you choose) and to average all data files into a single averaged data file (now a new format, with extension .AVE by convention). In the most common usage, it obtains the set of raw data files to use as input from a CALIBRATE.DAT file and also calibrates the energy scale using the values read from this file. The output file, *.AVE, is ready for PROCESS (discussed in the XAS Data Reduction protocol). Before running MAVE, confirm that the most recent version of CALIBRATE.DAT in the default directory lists the files you want. REMEMBER to fill in the data log sheet with correct information about each average (see XAS Data Handling Protocol).
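Conceptually, the two operations described above are a weighted combination of channel ratios within each scan, followed by a point-by-point average over scans. The Python sketch below illustrates that idea only; it is not MAVE's implementation. It assumes that the weighted sum is normalized by the total weight, that all scans share the same energy grid, and that excluding a channel is equivalent to giving it zero weight. The function names and numbers are hypothetical.

```python
def weighted_channel_average(channels, i0, weights):
    """Weighted average of channel/I0 ratios at each energy point.

    channels: one list of counts per detector channel
    i0:       incident-intensity values, one per energy point
    weights:  one weight per channel (0 excludes the channel)
    """
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("all channels have zero weight")
    result = []
    for point in range(len(i0)):
        acc = sum(w * ch[point] / i0[point] for ch, w in zip(channels, weights))
        result.append(acc / total_weight)
    return result


def average_scans(scans):
    """Point-by-point mean of scans sampled on the same energy grid (unit weights)."""
    count = len(scans)
    return [sum(values) / count for values in zip(*scans)]


# Two channels, two energy points (made-up numbers).
channels = [[10.0, 20.0], [30.0, 40.0]]
i0 = [10.0, 10.0]

print(weighted_channel_average(channels, i0, [1.0, 3.0]))  # [2.5, 3.5]
print(weighted_channel_average(channels, i0, [1.0, 0.0]))  # [1.0, 2.0]
print(average_scans([[1.0, 2.0], [3.0, 4.0]]))             # [2.0, 3.0]
```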

An example dialog is provided here, using the CALIBRATE.DAT file created above:

C:\dir>mave /out=fsbra.ave
8 Files found :
1 FSBRA_006.001
2 FSBRA_006.002
3 FSBRA_006.003
4 FSBRA_006.004
5 FSBRA_006.005
6 FSBRA_006.006
7 FSBRA_006.007
8 FSBRA_006.008
Do you want to remove files from this list? [y/n] [N] :y

(in case you have decided that one of the files should not be included in the average)

Enter filenumber to remove [1] :8
8 FSBRA_006.008
remove this file? [y/n] [Y] :
7 Files found :
1 FSBRA_006.001
2 FSBRA_006.002
3 FSBRA_006.003
4 FSBRA_006.004
5 FSBRA_006.005
6 FSBRA_006.006
7 FSBRA_006.007
Do you want to remove files from this list? [y/n] [Y] :n
Do you want to change file weights? [y/n] [N] :

(This would allow you to change from unit weights for every file; very seldom used)

Enter array position for eV [3] :
Enter array position for rtc [1] :
Enter array position range for numerator [10, 39] :

(Be sure that this is the correct range for your data file; use MLIST as above)

Enter array position range for denominator [4, 4] :
Do you want to take logs? [y/n] [N] :

(You would use this if you were analyzing transmission data, for which you want log(I0/I1), so the response for numerator would be 4,4 and for denominator would be 5,5. You can also extract the transmission data for the calibrant by choosing numerator 5,5 and denominator 6,6.)

Do you want to change array position weights? [y/n] [N] :
Do you want to perform deadtime corrections? [y/n] [Y] :n
Statistical weights :
Enter 1 to calculate, 2 to use weights from file, 3 for unit weights [2] :1

[Proper weighting of fluorescence channels adjusts for more or less edge (signal) in channels. See brief discussion in SSRL EXAFSPAK Manual]

Calculate statistical weights :
Enter ev-low, ev-high [7070.00, 7160.00] :

(These are the energy positions just before and just after the edge of interest, respectively)

Do you want to exclude specific array positions? [y/n] [N] :y

(Sometimes some channels are bad and need to be excluded from average)

Enter file-number [0] :1

(You can choose a specific file with a particular bad channel)

1 FSBRA_006.001
Enter array position to exclude [0] :13

(This is the number from the array position range you have specified (10-39 here))

Enter file-number [0] :3
3 FSBRA_006.003
Enter array position to exclude [0] :16
Enter file-number [0] :

(0 ends the channel selection. We have now excluded channel 13 of scan 1 and channel 16 of scan 3. It is more common to exclude an array position from all scans if it shows problems in any scan.)

Recalibration ...
Do you want to shift individual spectra? [y/n] [N] :

(This choice will average energies and perform the same calibration for all scans; y would result in separate calibration of each individual scan)

Enter apparent eV0, true eV0 [7111.06, 7111.30] :

(apparent eV0 is read from CALIBRATE.DAT; true eV0 is looked up as the first inflection point for this edge. A selected list of values is available in the XAS Element & Energy Table)

Enter d-space, steps per degree [1.92016, 20000.00] :

(These values were inserted into the data file at data collection; they should be correct)

Enter Za, E0 [26.0000, 7130.00] :,7120

(We have chosen to use E0 values 10 eV lower than those in the EXAFSPAK lookup table. A selected list can be found in our XAS Element & Energy Table. Notice the use of the initial comma to leave the first default value unchanged)

File FSBRA_006.001
calculating statistical weights ....
File FSBRA_006.002
calculating statistical weights ....
File FSBRA_006.003
calculating statistical weights ....
File FSBRA_006.004
calculating statistical weights ....
File FSBRA_006.005
calculating statistical weights ....
File FSBRA_006.006
calculating statistical weights ....
File FSBRA_006.007
calculating statistical weights ....

Hit the Return/Enter key to exit the MAVE window. You have now created your first averaged data file. You will be able to see the result when you use PROCESS, the first step of the XAS Data Reduction tutorial.
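The recalibration prompts in the dialog above ask for the apparent and true eV0 together with the monochromator d-spacing. This tutorial does not spell out how MAVE applies the correction. The Python sketch below shows one plausible approach under stated assumptions: the energy error is treated as a constant offset in monochromator angle, converted with Bragg's law, E = hc / (2 d sin θ). The function names are hypothetical, the steps-per-degree value is not used, and the result should not be taken as a reproduction of MAVE's output.

```python
import math

HC_EV_ANGSTROM = 12398.42  # h*c in eV*Angstrom


def bragg_angle(energy_ev, d_spacing):
    """Bragg angle in radians for a first-order reflection."""
    return math.asin(HC_EV_ANGSTROM / (2.0 * d_spacing * energy_ev))


def bragg_energy(theta, d_spacing):
    """Photon energy in eV for a given Bragg angle in radians."""
    return HC_EV_ANGSTROM / (2.0 * d_spacing * math.sin(theta))


def recalibrate(energy_ev, apparent_e0, true_e0, d_spacing):
    """Shift an energy by the constant angle offset that maps apparent_e0 to true_e0."""
    offset = bragg_angle(true_e0, d_spacing) - bragg_angle(apparent_e0, d_spacing)
    return bragg_energy(bragg_angle(energy_ev, d_spacing) + offset, d_spacing)


# Values taken from the example dialog above.
D_SPACING = 1.92016
APPARENT_E0, TRUE_E0 = 7111.06, 7111.30

for energy in (7111.06, 7500.0):
    corrected = recalibrate(energy, APPARENT_E0, TRUE_E0, D_SPACING)
    print(f"{energy:.2f} eV -> {corrected:.2f} eV")
```

Under this assumption the correction is not a constant energy shift: the same angle offset corresponds to a slightly larger energy change at higher energies.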

scott@chem.uga.edu
Department of Chemistry
University of Georgia
Athens, GA 30602-2556
706 542-2240 | FAX: 706 542-2295