I like a good mystery. Like this evening, when I saw only 130GB free on my 800GB disk partition, with just 280GB reported as used. Where did the rest of the disk go?!
First up, I ran WinDirStat as an administrator; no, it still reported only 280GB used. What was going on here?
One possible cause for this apparent mismatch is System Restore points, which don't appear in disk usage reports, so to confirm I went and took a look.
In the search box, type Control Panel and open the Control Panel App:
Open All Control Panel Items > Recovery and select Configure System Restore:
Under the System Protection tab open Configure...
The maximum size for Disk Space Usage was set to the size of the disk and 400GB had already been used...
So, after setting the Max Usage back down to 10%, I had a decent amount of free space again.
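Incidentally, the same check and fix can be done from an elevated command prompt with vssadmin, which manages the shadow copy storage that System Restore uses. Roughly like this (the drive letter and percentage are just examples):

vssadmin list shadowstorage
vssadmin resize shadowstorage /For=C: /On=C: /MaxSize=10%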
Then I needed to enter the BIOS (holding Esc on startup) and set it to boot from the USB.
After this everything went very smoothly. My PC is connected to the network via Ethernet, so nothing was required to set up the internet connection, and the Windows license was still applied after installation.
Finally, and most fortunately, I had made a full backup of the PC only the day before, so I hadn't lost any data. The final restore step was painless.
As to why it failed like this, I don't know. I ran CHKDSK afterwards and nothing untoward was reported. Perhaps a bad Windows update? Although I didn't see any reports of similar issues.
Over many years of working in software development, managing your code's dependencies, whether in binary or source form, has always been a pain, to be honest, whichever approach you take.
Anyway, having installed a new Ubuntu partition I wanted to get back the development environment I'd been using for the last couple of years. This is what I did...
Install Visual Studio Code
It's my tool of choice for Python. On Ubuntu I followed these instructions
I then installed the Python and Pylance extensions for VS Code
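For reference, the whole setup boils down to a few commands. A sketch of the snap route plus the extension installs from the command line (I believe the extension IDs are ms-python.python and ms-python.vscode-pylance, but check the Marketplace):

# Install VS Code and the Python/Pylance extensions (snap route; the apt repository install works too)
sudo snap install --classic code
code --install-extension ms-python.python
code --install-extension ms-python.vscode-pylance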
To check the Python environment being used, open the Command Palette with Ctrl-Shift-P and then Python: Select Interpreter.
Install Anaconda
It has its quirks but I have been using Anaconda. I installed from scratch using the instructions here.
The default installation included all the necessary libraries for simple machine learning code.
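For the record, the install amounts to downloading the shell installer and running it; something like this, with the version in the filename a placeholder rather than the release I actually used:

# Download and run the Anaconda installer, then follow the prompts
wget https://repo.anaconda.com/archive/Anaconda3-<version>-Linux-x86_64.sh
bash Anaconda3-<version>-Linux-x86_64.sh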
Installing Cirq
I've been looking at Quantum Computing over the last year using Cirq. I couldn't get this installed using the instructions here.
However python -m pip install Cirq worked and it appeared in my Anaconda base (root) environment in VS Code. Success!
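As a quick smoke test that the install is usable, a minimal circuit along these lines runs and prints measurement counts (purely illustrative, not part of any real analysis):

# Build a one-qubit circuit, flip the qubit, measure it and run on the simulator
import cirq

q = cirq.LineQubit(0)
circuit = cirq.Circuit(cirq.X(q), cirq.measure(q, key='m'))
result = cirq.Simulator().run(circuit, repetitions=10)
print(result.histogram(key='m'))  # expect all 1s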
Installing Pygame
And finally, for some playing around with game programming I installed Pygame with python -m pip install pygame.
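A one-liner is enough to confirm it imports cleanly and report the version:

python -c "import pygame; print(pygame.version.ver)"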
So my environment is back up and running on a new OS installation.
My Physics analysis code runs in 3 modes: for 1996 data, for 1997 data, and for both years' data. The third mode also runs over different values of cuts, so as to build a set of results that can be used to estimate systematic errors in the analysis. For example, changing the value of the cut on Calorimeter energy can provide an indication of how well it is modeled.
To run all these checks a number of "Steering card" files were created. Each one listing out the cuts to be made for a single batch run. The analysis would then run over the entire set of '96/'97 data for each batch in turn and build a list of result ntuples.
Now, while running '96 and '97 individually was successful, when running both, the process crashed on the second iteration.
In years gone by, I would have taken the approach of writing loads of print(*,*) statements around the "ANALYSE: File contains..." log and eventually homed in on the offending code but surely I can do better now!
It turns out I can. I can add debug flags: -fcheck=all -g
(-fcheck=all adds runtime checks for array bounds, amongst others; -g adds extra information for the gdb debugger.)
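In practice that just means adding the flags to the compile step, along these lines (the exact command is illustrative; my build goes through a Makefile):

# Compile with runtime checks and debug symbols
gfortran -fcheck=all -g -c isr_cuts96.fpp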
The line in isr_cuts96.fpp is:
call hf1(510,temp,1.)
This is filling a histogram with data. The fact it's failing on the second iteration of the batch suggests the histogram isn't being correctly cleared down after the first run.
The module I have to do the clearing down is called terminate.fpp
CALL HDELET (ID). Action: deletes a histogram and releases the corresponding storage space. Input parameter: ID, the identifier of a histogram; ID=0 deletes all existing histograms.
Adding the following to terminate.fpp fixed the problem, and I am now able to run multiple batches in the same process:
call hdelet(0)
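For context, the relevant part of the routine now looks something like this (the subroutine name and surrounding lines are a sketch from memory; the hdelet call is the actual fix):

      SUBROUTINE TERMINATE
C     Delete all HBOOK histograms so the next batch starts from a clean state
      CALL HDELET(0)
      RETURN
      END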
Why this ran OK back in 2000 is probably down to a combination of the older CERNLIB library and the older Fortran compiler.
With a couple of projects on .NET Core 3.1 I thought I would update to .NET 6.0. This was pretty simple!
First, update the TargetFramework in the project references to net6.0:
<TargetFramework>net6.0</TargetFramework>
Next, I took the opportunity to update NuGet references to the latest versions. I also cleared the NuGet cache with:
dotnet nuget locals all --clear
At this point Visual Studio 2022 IntelliSense started warning that references were missing (although the code compiled properly), and the Test Runner couldn't find any of the tests. To resolve this I deleted the .vs folder and restarted.
In the SuperMarketPlanner XAML application, to fix a new compiler warning, I changed the Project SDK in the csproj from
Project Sdk="Microsoft.NET.Sdk.WindowsDesktop"
to
Project Sdk="Microsoft.NET.Sdk"
Finally, in the Azure Pipeline yaml I had to update the SDK version used by the build.
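With the UseDotNet task that amounts to something like this (a sketch rather than a copy of my exact yaml; the version string in particular is an assumption):

# Pin the pipeline to the .NET 6 SDK
- task: UseDotNet@2
  inputs:
    packageType: 'sdk'
    version: '6.x'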
CERNLIB is available as a package on Ubuntu. However, I've seen issues around running it on 64-bit Linux, so for practicality I decided to set up a 32-bit Linux VM. As noted before, I opted for Linux Mint on VirtualBox, picking the 32-bit version of Mint 19.3 Tricia.
So the Fortran compiler has moved on a bit since October 2000, which was the last time I ran this code. I needed to make a few changes to compile my source successfully. Fortunately nothing too significant was required.
.EQ. to .EQV.
I was using .EQ. for logical comparison of a LOGICAL variable, for example:
IF (Q2WTFIRST.EQ..TRUE.) THEN
However this should be .EQV. :
IF (Q2WTFIRST.EQV..TRUE.) THEN
Using 1 and 0 for True and False
Related to the above, a compiler error was thrown when using .eq. with 1 or 0 to represent true or false:
tltcut = 0
... Change tltcut to 1 if the condition is met
if (tltcut.eq.1) then
The fix was to change the condition to the form below (with the assignments switched to .TRUE. and .FALSE. to match):
if (tltcut) then
DFLIB
At the time Bristol University were trying out Visual Fortran (originally DEC, now Intel Visual Fortran). We used the DFLIB library in a math error handler routine. DFLIB has quite a history, but the upshot is it's only supported on Windows. I just took out the routine using it!
Makefile
With Visual Fortran the build and link were controlled by the project files in the IDE. To build using gfortran on Linux I had to create a Makefile. This is an art in itself! The main issue I had to overcome was linking the CERNLIB libraries. Of the required libraries, mathlib, packlib and kernlib were available to the linker on the standard path; pdflib, however, was in a separate folder that needed including with the -L flag. Some looking around to find the correct name was required; in the end I needed pdflib804.
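A stripped-down sketch of the sort of Makefile this ends up as (the object list and library paths here are illustrative; the pdflib location in particular depends on where the package installs it):

# Sketch of a gfortran + CERNLIB Makefile
FC      = gfortran
FFLAGS  = -fcheck=all -g
# pdflib804 sits outside the default library path, hence the extra -L
LDFLAGS = -L/usr/lib/cernlib
LDLIBS  = -lmathlib -lpacklib -lkernlib -lpdflib804

OBJS = analyse.o isr_cuts96.o terminate.o

analysis: $(OBJS)
	$(FC) $(FFLAGS) -o $@ $(OBJS) $(LDFLAGS) $(LDLIBS)

%.o: %.fpp
	$(FC) $(FFLAGS) -c $< -o $@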
Include files in Fortran are inserted into the source program. This is useful if you have a common block, for example one defining shared variables, that you don't want to write out over and over.
Incidentally, the # indicates the file is processed by the C preprocessor rather than the standard Fortran INCLUDE.
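As a made-up illustration of the pattern (the block and variable names are hypothetical, not from my actual .inc files), the include file holds the common block:

C     calor.inc - shared calorimeter variables
      REAL ECAL, EPZ
      INTEGER NEVENT
      COMMON /CALOR/ ECAL, EPZ, NEVENT

and each source file that needs it pulls it in via the preprocessor:

#include "calor.inc"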
My inc files were mixed in with the Fortran source files. The compiler didn't like this so I updated the folder structure to use separate src/inc/exe folders. Funnily enough this reflected the original folder structure when I ran the code on Unix boxes before we switched to Visual Fortran. The circle is complete!
How to run the Analysis code
This took a bit of remembering. The code was driven by a number of input files.
Steering cards. Each contains a list of cuts used to select the events; different cards define different cuts and so allow me to analyse systematic errors.
File listing input Data .rz ntuples
File listing input Background .rz ntuples
File listing input MonteCarlo .rz ntuples
Of course, the naming convention wasn't particularly descriptive; for a run on 1997 data there would be:
stc97_n for the steering cards
fort.45 for the Data file list
fort.46 for the Background file list
fort.47 for the MonteCarlo file list
I should make clear at this point that the files I'm running against are not the "raw" ZEUS datafiles. A preliminary job was run against the full set of data on tape to load a cutdown set of data that passed some loose cuts and create an ntuple with the necessary fields for this particular analysis. My ntuples were saved after this first step.
Results
I ran the code against a single 1997 data, background and MonteCarlo ntuple.
This created, amongst some other files, an israll97.hbook file. By opening PAW and entering hi/file 1 israll97.hbook I was able to load this. hi/li listed the available histograms, which looked about right, and when I opened one of them with hi/pl 105: success! I had created a histogram of E-Pz (total energy minus momentum in the z, or beampipe, direction) from my saved data files. I originally used this to help select the population of events for my analysis.
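For anyone who hasn't used PAW, the whole session was just three commands at the PAW prompt:

PAW > hi/file 1 israll97.hbook
PAW > hi/li
PAW > hi/pl 105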
The README.md describes the list of input and output files.
Next steps...
Run the PAW macro (kumac) files I used after this analysis step and add to GitHub.
Run against the full data set. I want to see how quickly this runs on current hardware. To run the full 96/97 data set against all steering cards would take about 20 hours!
Create a GitHub release to include sample .rz files. Hopefully to allow anyone to run this!
Try and build the 64bit version of CERNLIB so I don't have to run on a 32bit VM.
I noticed in some GitHub projects a build status badge displayed in the README.md. As I have Azure Pipeline builds set up for two of my projects, CSVComparer and SuperMarketPlanner, I thought it would be a good idea to add this.
Handily, there is a REST API interface available for the Azure Pipeline build!
The definitionId is the ID of the Pipeline assigned by Azure. You can find this by navigating through to the Pipeline page via https://dev.azure.com/{Organisation}/
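The badge itself then goes into README.md as a markdown image pointing at the build status endpoint, linked to the pipeline; roughly like this (treat the exact URL shape as an approximation and compare it with what the "Status badge" option in the Pipeline menu gives you):

[![Build Status](https://dev.azure.com/{Organisation}/{Project}/_apis/build/status?definitionId={definitionId})](https://dev.azure.com/{Organisation}/{Project}/_build/latest?definitionId={definitionId})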