Over the last year or so I've been writing a simple tool in .NET Core to compare two CSV files. It can be found on GitHub here. I won't go into details about the tool in this post, but because it is a .NET Core application I've been able to develop it on Windows and also deploy and run it on Linux. This post describes how I published and deployed it.
I've been using Visual Studio. When I want to create a new deployment I right-click on the Project and select Publish...
I have set up a profile called "FolderProfile" (very imaginative!) with these settings:
- TargetLocation : Folder name
- Configuration : Release | Any CPU (from my solution)
- Deployment mode : Framework-dependent
- TargetFramework : netcoreapp3.0
- TargetRuntime : Portable
total 68
drwxr-xr-x  2 jonathan jonathan  4096 Mar 12 19:27 .
drwxr-xr-x 64 jonathan jonathan  4096 Mar 12 19:35 ..
-rwx------  1 jonathan jonathan   431 Jul  2  2020 CSVComparison.deps.json
-rwx------  1 jonathan jonathan 14336 Jul  2  2020 CSVComparison.dll
-rwx------  1 jonathan jonathan  4860 Jul  2  2020 CSVComparison.pdb
-rwx------  1 jonathan jonathan   154 Jul  2  2020 CSVComparison.runtimeconfig.json
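The publish profile settings listed above can also be approximated from the command line with dotnet publish. This is only a sketch, not the setup used for this post: the output folder ./publish is a placeholder, and it assumes the script is run from the directory containing the project file. When the dotnet CLI or a project file is missing, it just prints the command it would have run.

```shell
#!/bin/sh
# Sketch: command-line approximation of the "FolderProfile" publish settings.
# Assumption: ./publish is a placeholder output folder, not the one used in the post.
OUT_DIR="./publish"

# -c Release               -> Configuration: Release
# -f netcoreapp3.0         -> TargetFramework: netcoreapp3.0
# --self-contained false   -> Deployment mode: Framework-dependent
# no -r/--runtime argument -> TargetRuntime: Portable
PUBLISH_CMD="dotnet publish -c Release -f netcoreapp3.0 --self-contained false -o $OUT_DIR"

# Only publish when the dotnet CLI is on PATH and a project file is present.
if command -v dotnet >/dev/null 2>&1 && ls ./*.csproj >/dev/null 2>&1; then
    $PUBLISH_CMD
else
    echo "Would run: $PUBLISH_CMD"
fi
```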
I had previously installed .NET Core on Ubuntu. At the time of writing, the installed SDK version (dotnet --version) is 3.1.404.
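A framework-dependent deployment relies on a compatible Microsoft.NETCore.App runtime already being present on the target machine, so it can be worth checking what the dotnet host can see before copying the files over. Note that dotnet --version reports the SDK version, while dotnet --list-runtimes shows the installed runtimes. The sketch below only wraps those standard commands; the DOTNET_FOUND variable is a hypothetical name used for illustration.

```shell
#!/bin/sh
# Sketch: inspect the .NET Core installation on the target machine.
# DOTNET_FOUND is a hypothetical variable name, used only in this example.
if command -v dotnet >/dev/null 2>&1; then
    DOTNET_FOUND=yes
    # SDK version in use; this can fail when no SDK is installed.
    dotnet --version || echo "dotnet --version failed (possibly a runtime-only install)"
    # All installed SDKs.
    dotnet --list-sdks
    # Installed runtimes; look for a Microsoft.NETCore.App 3.x entry.
    dotnet --list-runtimes
else
    DOTNET_FOUND=no
    echo "dotnet CLI not found on PATH"
fi
```

By default, an application targeting netcoreapp3.0 should roll forward to a later 3.x minor runtime when 3.0 itself is not installed, which is consistent with it running on a 3.1 installation here.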
The IMDB/movie_data.csv file is a 50,000-row CSV I generated from the IMDB movie dataset. I had originally set this up for a machine learning exercise, but it is also a good-sized dataset for checking out the tool. I made two copies and edited one of the comments in the candidate file. Then, to run the CSV Comparison, I navigated to the folder containing the binaries and ran this command line:
dotnet ./CSVComparison.dll ../IMDB/movie_data.csv ../IMDB/movie_data2.csv ../IMDB/movie_config.xml ./output
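With four positional arguments it is easy to mistype a path or pass the same file twice, so a small wrapper can check the inputs before the comparison starts. This is a sketch only: run_comparison is a hypothetical helper, and the argument order (reference, candidate, configuration, output folder) is inferred from the command above rather than taken from the tool's documentation.

```shell
#!/bin/sh
# Hypothetical helper: validate inputs, then invoke the comparison.
# Assumed argument order: reference CSV, candidate CSV, config XML, output folder.
run_comparison() {
    reference="$1"
    candidate="$2"
    config="$3"
    output="$4"

    # Every input file must exist before we start.
    for f in "$reference" "$candidate" "$config"; do
        if [ ! -f "$f" ]; then
            echo "Missing input file: $f" >&2
            return 1
        fi
    done

    # Comparing a file with itself would always report zero breaks.
    if [ "$reference" = "$candidate" ]; then
        echo "Reference and candidate are the same file: $reference" >&2
        return 1
    fi

    dotnet ./CSVComparison.dll "$reference" "$candidate" "$config" "$output"
}

# Example call (paths as used in this post):
# run_comparison ../IMDB/movie_data.csv ../IMDB/movie_data2.csv ../IMDB/movie_config.xml ./output
```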
The output:
Reference: ./movie_data.csv
Candidate: ./movie_data2.csv
Saving results to output/ComparisonResults.csv
Finished. Comparison took 10517ms
And the result:
<?xml version="1.0" encoding="utf-8"?>
<ComparisonDefinition xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Delimiter>,</Delimiter>
  <KeyColumns>
    <Column>row</Column>
  </KeyColumns>
  <HeaderRowIndex>0</HeaderRowIndex>
  <ToleranceValue>0.1</ToleranceValue>
  <IgnoreInvalidRows>false</IgnoreInvalidRows>
  <ToleranceType>Relative</ToleranceType>
  <ExcludedColumns />
</ComparisonDefinition>
Date run: 22/03/2021 18:40:12
Reference: ./movie_data.csv
Candidate: ./movie_data2.csv
Number of Reference rows: 50001
Number of Candidate rows: 50001
Comparison took 10517ms
Number of breaks 1
Break Type,Key,Column Name,Reference Row, Reference Value, Candidate Row, Candidate Value
ValueMismatch,7230,review,96,"Exceptional movie that handles a theme of