Tools Documentation & User Guide

2,746+ posts • 95.4% quality

Overview

This is your complete toolkit for managing Chia news content, built to transition from manual HTML editing to a modern JSON-based system while maintaining your current workflow.

Total Posts: 2,746+
Weeks Covered: 129+
Data Quality: 95.4%
Completion Rate: 100%

Project Structure

/mnt/artoo/e/DEV/twic-parser/
|
+-- Data Files
|   +-- chia_news_cleaned.json         # Your main dataset (validated & cleaned)
|   +-- chia_news.json                 # Original parsed data
|   +-- test_data.json                 # Safe testing environment
|   +-- chia_news_validation_report.json
|
+-- Core Tools
|   +-- tweet-converter.js             # Convert single tweets to JSON
|   +-- batch-tweets.js                # Batch processing & interactive mode + HTML gen
|   +-- json-manager.js                # Manage & clean JSON data
|   +-- validate.js                    # Data quality validation
|   +-- cleanup.js                     # Fix data quality issues
|
+-- Development
|   +-- parser.js                      # Original HTML parser
|   +-- index.html                     # Current production site
|
+-- Node Dependencies
    +-- node_modules/                  # (not needed for basic tools)

Tool Reference Guide

🔧 Tweet Converter (tweet-converter.js)

Purpose: Convert individual Twitter/X URLs into your JSON format

⚡ Basic Usage

# Convert single tweet
node tweet-converter.js https://x.com/username/status/1234567890

# With custom JSON file
node tweet-converter.js https://x.com/username/status/1234567890 my_data.json

✨ What It Does

  • Fetches tweet data via Twitter oEmbed API
  • Fallback HTML scraping if oEmbed fails
  • Resolves t.co shortened URLs to real destinations
  • Smart categorization (Space/Video/News/Release/etc.)
  • Extracts author info, mentions, links, topics
  • Auto-detects content type and assigns proper metadata
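The first step in the list above, turning a pasted URL into an oEmbed request, could be sketched roughly as follows. This is not the code from tweet-converter.js: the function name `parseTweetUrl` and the returned field names are hypothetical, and the sketch assumes the public oEmbed endpoint at publish.twitter.com/oembed accepts an x.com URL in its `url` parameter.

```javascript
// Hypothetical helper, not taken from tweet-converter.js.
// Parses a tweet URL and builds the matching oEmbed request URL.
function parseTweetUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a URL at all
  }

  // Accept x.com and twitter.com, with or without a www./mobile. prefix.
  const host = url.hostname.replace(/^(www\.|mobile\.)/, '');
  if (host !== 'x.com' && host !== 'twitter.com') return null;

  // Expected path shape: /<username>/status/<numeric id>
  const match = url.pathname.match(/^\/([A-Za-z0-9_]{1,15})\/status\/(\d+)/);
  if (!match) return null;

  const [, username, id] = match;
  // Drop tracking query strings so the same tweet always maps to one URL.
  const canonical = `https://x.com/${username}/status/${id}`;
  return {
    username,
    id,
    canonical,
    // Assumed endpoint; the real tool may build this differently.
    oembedUrl: `https://publish.twitter.com/oembed?url=${encodeURIComponent(canonical)}`,
  };
}
```

Returning `null` for anything that is not a tweet URL gives the caller a simple way to skip bad input before any network request is made.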

⚡ Batch Tweet Processor (batch-tweets.js) ⭐ NEW FEATURES

Purpose: Process multiple tweets efficiently with interactive workflow + HTML generation

🔄 Interactive Mode RECOMMENDED

node batch-tweets.js interactive

Workflow:

  1. Paste tweet URLs as you find them
  2. Each gets automatically converted and added
  3. 🌐 HTML code generated for current website
  4. Copy/paste HTML directly into your site
  5. Type stats to see current totals
  6. Type quit when done

🧪 Test Mode 🧪 NEW

node batch-tweets.js test

Perfect for:

  • Testing HTML generation without affecting production data
  • Experimenting with different tweet types
  • Verifying formatting before going live
  • Uses separate test_data.json file
🌟 NEW HTML Generation Features:
  • Real-time HTML output for current website format
  • Quote tweet detection - converts URLs to "Quote Tweet" links
  • Smart link formatting (Watch/Listen/Read article/etc.)
  • Category icon display with proper tooltips
  • Author role handling (Hosted by vs regular attribution)
  • HTML entity escaping (prevents double-encoding)
  • Copy/paste ready format matching your current site

📊 JSON Manager (json-manager.js)

Purpose: Manage, clean, and organize your JSON data

⚡ Quick Commands

# List recent posts
node json-manager.js list 20

# Search posts
node json-manager.js search "chia wallet"

# Remove specific post
node json-manager.js remove 2025-06-17-123

# Remove multiple posts
node json-manager.js bulk 2025-06-17-123,2025-06-16-456

# Show statistics
node json-manager.js stats
🛡️ Safety Features:
  • Automatic backups before saving
  • Changes not saved until you confirm
  • Detailed preview before removal
  • Bulk operations with confirmation

✅ Data Validator (validate.js)

Purpose: Check data quality and identify issues

# Validate your main dataset
node validate.js chia_news_cleaned.json

# Validate with detailed reporting
node validate.js chia_news.json

🔍 What It Checks

  • JSON structure integrity
  • Required fields presence
  • Data consistency across weeks
  • Author information completeness
  • Link and mention formatting
  • Category and type validation
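To make the checks above concrete, here is a minimal sketch of what a per-post check might look like. The field names (`id`, `date`, `author`, `content`, `category`, `links`, `mentions`) are assumptions for illustration only; the real schema used by validate.js may differ.

```javascript
// Field names below are illustrative assumptions, not the real schema.
const REQUIRED_FIELDS = ['id', 'date', 'author', 'content', 'category'];

// Returns a list of human-readable issues; an empty list means the post passed.
function validatePost(post) {
  const issues = [];

  // Required fields presence.
  for (const field of REQUIRED_FIELDS) {
    const value = post[field];
    if (value === undefined || value === null || value === '') {
      issues.push(`missing required field: ${field}`);
    }
  }

  // Link formatting: links should be absolute http(s) URLs.
  for (const link of post.links || []) {
    if (!/^https?:\/\//.test(link)) issues.push(`malformed link: ${link}`);
  }

  // Mention formatting: mentions should look like @handle.
  for (const mention of post.mentions || []) {
    if (!/^@[A-Za-z0-9_]+$/.test(mention)) issues.push(`malformed mention: ${mention}`);
  }

  return issues;
}
```

Collecting issues into a list, rather than stopping at the first one, makes it possible to count warnings across the whole dataset in a single pass.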

🧹 Data Cleanup (cleanup.js)

Purpose: Automatically fix common data quality issues

# Analyze warning patterns
node cleanup.js analyze chia_news.json

# Clean the data automatically  
node cleanup.js clean chia_news.json
📊 Results: Input: 204 warnings → Output: 23 warnings
💯 Health Score: Improved from 0.0 to 95.4/100

Daily Workflow Recommendations

📱 For Regular News Posting (UPDATED)

⚡ Option 1: Interactive Mode with HTML ⭐ Fastest

cd /mnt/artoo/e/DEV/twic-parser
node batch-tweets.js interactive
  • Paste URLs throughout the day as you find them
  • Copy HTML output directly to current website
  • JSON data builds automatically for future site

🧪 Option 2: Test Mode for Experimentation 🧪

node batch-tweets.js test
  • Safe environment to test HTML formatting
  • Try different tweet types without affecting production
  • Perfect for learning the system

📝 Option 3: Batch at End of Day

# Collect URLs in a file during the day
# Process all at once:
node batch-tweets.js file todays_tweets.txt
  • Collect URLs in a file during the day
  • Process all at once at the end
  • Good for organized workflow
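For the end-of-day file, a loader along these lines would tolerate blank lines and accidental duplicates. This is only a sketch: the function name `extractTweetUrls` is hypothetical, and the one-URL-per-line layout and `#` comment lines are assumptions about the file, not documented behaviour of batch-tweets.js.

```javascript
// Hypothetical helper: turns the text of a URL list into unique tweet URLs.
// Assumes one URL per line; lines starting with '#' are treated as notes.
function extractTweetUrls(text) {
  const seen = new Set();
  const urls = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // blank line or note
    // Keep only lines that look like tweet URLs.
    if (!/^https?:\/\/(x|twitter)\.com\/\w+\/status\/\d+/.test(line)) continue;
    if (seen.has(line)) continue; // skip URLs pasted twice
    seen.add(line);
    urls.push(line);
  }
  return urls;
}
```

The order of first appearance is preserved, so posts end up in the order they were collected during the day.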

🔧 For Data Maintenance

📅 Weekly Validation

# Check data health
node validate.js chia_news_cleaned.json

# Clean if needed
node cleanup.js clean chia_news_cleaned.json

🗓️ Monthly Cleanup

# Interactive cleanup session
node json-manager.js interactive
# Use: list, search, remove as needed

HTML Generation Features NEW

✨ What Gets Generated

  • 🎯 Category Icons: 🌎🚀 with proper tooltips
  • 📝 Post Types: Special styling for X Spaces, Releases, etc.
  • 👤 Author Attribution: "Hosted by" for spaces, regular for others
  • 🔗 Quote Tweets: Plain URLs → "Quote Tweet" links
  • 🎨 Smart Links: Watch/Listen/Read article based on destination
  • 🛡️ Proper Escaping: Prevents HTML entity issues (such as a literal &mdash; in the output)
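The "Smart Links" idea can be illustrated with a small lookup on the link's host. The host rules below are guesses chosen for the example, not the table batch-tweets.js actually uses, and `linkLabel` is a hypothetical name.

```javascript
// Host-to-label rules are illustrative guesses, not the tool's actual table.
function linkLabel(href) {
  let host;
  try {
    host = new URL(href).hostname.replace(/^www\./, '');
  } catch {
    return 'Link'; // unparseable destination: fall back to a neutral label
  }
  if (host === 'youtube.com' || host === 'youtu.be') return 'Watch';
  if (host.endsWith('spotify.com') || host.endsWith('podcasts.apple.com')) return 'Listen';
  return 'Read article'; // default for everything else
}
```

Because t.co links are resolved to their real destinations first, a lookup like this can work on the final host rather than the shortener.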

📋 HTML Output Examples

🚀 X Space

<li class='post'><span title="Community">🌎</span> <span title="Space">🚀</span> <span style='color:#8C52FE'>X Space</span> - Hosted by <a href='https://x.com/DracattusDev' target='_blank'>@DracattusDev</a> "Weekly Chia discussion". • <a href='https://example.com' target='_blank'>Listen</a> • <a href='https://x.com/source' target='_blank'>Source</a></li>

📢 Announcement with Quote Tweet

<li class='post'><span title="Chia">🌱</span> <span title="News">📢</span> <a href='https://x.com/chia_project' target='_blank'>@chia_project</a> announces new partnership! <a href='https://x.com/partner/status/123' target='_blank'>Quote Tweet</a> for details. • <a href='https://x.com/chia_project/status/456' target='_blank'>Source</a></li>

▢️ Video Release

<li class='post'><span title="Community">🌎</span> <span title="Video">▶️</span> <a href='https://x.com/creator' target='_blank'>@creator</a> releases new tutorial video! • <a href='https://youtube.com/watch?v=123' target='_blank'>Watch</a> • <a href='https://x.com/creator/status/789' target='_blank'>Source</a></li>
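A renderer that produces the simple (non-Space) shape shown in the examples might look like the sketch below. The post object layout (`icons`, `author`, `text`, `links`) is an assumption made for this example and is not the JSON schema the tools actually store; `renderPost` is a hypothetical name.

```javascript
// Post shape and field names are assumptions made for this sketch.
// Assumes post.text has already been HTML-escaped.
function renderPost(post) {
  // Category/type icons, each with a tooltip via the title attribute.
  const icons = post.icons
    .map(({ title, emoji }) => `<span title="${title}">${emoji}</span>`)
    .join(' ');

  // Regular author attribution as a profile link.
  const author = `<a href='https://x.com/${post.author}' target='_blank'>@${post.author}</a>`;

  // Trailing action links (Watch, Source, ...) separated by bullets.
  const links = post.links
    .map(({ href, label }) => `<a href='${href}' target='_blank'>${label}</a>`)
    .join(' • ');

  return `<li class='post'>${icons} ${author} ${post.text} • ${links}</li>`;
}
```

Called with two icons, the author `creator`, and a Watch plus a Source link, this yields a line in the same format as the Video Release example.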

Data Quality Metrics

📊 Current Status

Total Posts: 2,746+
Health Score: 95.4/100
Success Rate: 100%

🎯 Quality Indicators

  • 🟢 Green (90-100): Production ready
  • 🟡 Yellow (70-89): Minor issues, still usable
  • 🔴 Red (0-69): Requires cleanup before use
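The bands above map onto a simple threshold check. In this sketch, `healthBand` is a hypothetical name, and fractional scores between bands (for example 89.5) are assumed to fall into the lower band, which the guide does not state explicitly.

```javascript
// Maps a 0-100 health score to the colour bands listed above.
// Assumption: fractional scores below a threshold belong to the lower band.
function healthBand(score) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 100) {
    throw new RangeError(`health score must be between 0 and 100, got ${score}`);
  }
  if (score >= 90) return 'green';  // production ready
  if (score >= 70) return 'yellow'; // minor issues, still usable
  return 'red';                     // requires cleanup before use
}
```

With the current score of 95.4, `healthBand(95.4)` returns `'green'`.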

Troubleshooting Guide

🔧 Common Issues & Solutions

❌ "Failed to convert tweet: All fetch methods failed"

  • 🔍 Cause: Twitter API rate limiting or network issues
  • ✅ Solution: Tool automatically retries 2-3 times
  • 🛠️ Manual: Wait 10 seconds, then try again
  • 📝 Batch Mode: Failed URLs are saved to a log file that can be reprocessed
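The retry behaviour described above follows a common pattern, sketched here in generic form. This is not the tool's actual implementation: `withRetries`, its option names, and the default values (3 attempts, 10 second pause, borrowed from the manual advice above) are assumptions for illustration.

```javascript
// Generic retry wrapper; attempt count and delay are illustrative defaults.
async function withRetries(task, { attempts = 3, delayMs = 10000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Pass the attempt number so the task can log progress if it wants to.
      return await task(attempt);
    } catch (error) {
      lastError = error;
      // Pause before the next attempt, but not after the final failure.
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A batch runner could catch the final error and append the URL to the failed-URL log instead of stopping the whole run.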

⚠️ Quote tweets showing duplicate links

  • 🔍 Cause: Bug in earlier versions, fixed in the latest version
  • ✅ Solution: Update to latest batch-tweets.js
  • 📊 Result: Only the "Quote Tweet" link is shown, with no duplicate

⚠️ HTML entities showing as &mdash;

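  • 🔍 Cause: Entity handling bug in earlier versions, fixed in the latest version
  • ✅ Solution: Update to the latest version, which handles entities properly
  • 📊 Result: The dash renders cleanly in HTML output instead of a literal &mdash;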
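Double-encoding typically happens when text that already contains an entity such as &mdash; is escaped again, turning it into &amp;mdash;. One way to avoid that, shown here only as a sketch and not as the fix actually shipped in batch-tweets.js, is to leave existing entities alone when escaping. `escapeHtmlOnce` is a hypothetical name.

```javascript
// Escapes HTML-special characters but leaves existing entities such as
// &mdash; or &#8212; untouched, so they are not double-encoded to &amp;mdash;.
function escapeHtmlOnce(text) {
  return text
    // Escape '&' only when it does not already start a named or numeric entity.
    .replace(/&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#\d+|#[xX][0-9a-fA-F]+);)/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}
```

A useful property of this approach is that running it twice gives the same result as running it once, so text that passes through the escaper more than once stays clean.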

🌐 Network Issues & Error Recovery

# Check error log files (auto-generated)
ls -la failed_tweets_*.txt

# Reprocess failed URLs
node batch-tweets.js file failed_tweets_2025-06-20T15-30-45-123Z.txt

# Test connectivity
curl -I https://publish.twitter.com
curl -I https://x.com

Advanced Usage

🧪 Working with Test Mode

# Start clean testing session
node batch-tweets.js test

# Test with custom file
node batch-tweets.js test my_experiments.json

# Commands available in test mode:
# clear - Reset test data
# stats - View test statistics
# quit - Exit test mode

🔄 Batch Error Recovery

# After batch processing, check for error logs
ls -la failed_tweets_*.txt

# Reprocess failed URLs (file format is ready-to-use)
node batch-tweets.js file failed_tweets_2025-06-20T15-30-45-123Z.txt

# Or manually retry individual URLs
node batch-tweets.js interactive
# Then paste the failed URLs one by one
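The log file names shown above look like an ISO 8601 timestamp with the `:` and `.` characters replaced by hyphens. A sketch of how such a name and file body could be produced follows; the function names are hypothetical, and the one-URL-per-line body is an assumption based on the file being directly reusable with the `file` command.

```javascript
// Builds a log name in the failed_tweets_<timestamp>.txt pattern shown above.
function failedLogName(date = new Date()) {
  // ISO 8601 contains ':' and '.', which are awkward in file names.
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  return `failed_tweets_${stamp}.txt`;
}

// Assumed layout: one URL per line, so the file can be fed straight back
// to `node batch-tweets.js file <name>`.
function failedLogBody(urls) {
  return urls.join('\n') + '\n';
}
```

Because the timestamp sorts lexicographically, `ls failed_tweets_*.txt` lists the logs in chronological order.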

🌐 Integration with Current Site

🚀 NEW WORKFLOW:
  1. Use interactive mode to get both JSON + HTML
  2. Copy HTML to current website immediately
  3. JSON builds automatically for future site
  4. Best of both worlds!

📊 Data Flow (UPDATED)

Twitter URL → tweet-converter.js → JSON Entry → batch-tweets.js → HTML + JSON
                                                      ↓
                                               Current Website ← Copy/Paste HTML
                                                      ↓
                                            chia_news_cleaned.json ← Future Site Data

⚡ Performance Metrics

Operation          Time                      Notes
Single tweet       ~2-3 seconds              Including URL resolution
With retries       ~4-8 seconds max          For problematic URLs
Batch processing   ~2-4 seconds per tweet    With rate limiting
HTML generation    Near-instant              No network calls
JSON operations    Near-instant              In-memory processing
Validation         ~1-2 seconds              For 2,746+ posts

🗺️ Development Roadmap

✅ COMPLETED

  • HTML to JSON parser
  • Tweet converter with URL resolution
  • Batch processing tools
  • Data validation & cleanup
  • JSON management utilities
  • HTML generation for current website
  • Test mode for safe experimentation
  • Retry logic and error handling
  • Quote tweet link conversion

🚧 IN PROGRESS

  • Documentation updates (this file!)

📋 PLANNED

  • Twitter List webapp (keyboard-driven workflow)
  • Manual entry tool (web form)
  • Production server deployment
  • Weekly/monthly automated reports

🎯 Success Metrics

You'll know the tools are working well when:

  • Daily posting takes < 30 seconds per tweet
  • HTML generated instantly for current website
  • Data quality stays > 90% health score
  • No manual JSON editing required
  • No manual HTML coding required
  • Backups created automatically
  • Search finds content quickly
  • Failed URLs automatically retried and logged
  • Test mode experiments safely

🌟 What's New in This Version

🌐 HTML Generation System

  • Real-time HTML output matching your current website format
  • Smart quote tweet detection and link conversion
  • Proper HTML entity handling (no more encoding issues)
  • Category icons with tooltips
  • Author role detection (Hosted by vs regular)

🧪 Test Mode

  • Safe experimentation environment (test_data.json)
  • All interactive features work without affecting production
  • Clear command to reset test data
  • Visual indicators for test vs production mode

🔄 Retry Logic & Error Handling

  • Automatic retries for network failures
  • Error logging with reprocessable URL files
  • Detailed progress reporting during retries
  • Success/failure statistics with actionable next steps

✨ Quality of Life Improvements

  • Better error messages with specific solutions
  • Automatic backup creation
  • Improved performance with rate limiting
  • Cross-platform compatibility